Distributed Version Concurrency in a Transactional Memory Cluster Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza University of Toronto, Canada


Page 1:

Exploiting Distributed Version Concurrency in a Transactional Memory Cluster

Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza

University of Toronto, Canada

Page 2:

Transactional Memory Programming Paradigm

Each thread executing a parallel region:
- Announces the start of a transaction
- Executes operations on shared objects
- Attempts to commit the transaction
  - If there is no data race, the commit succeeds and the operations take effect
  - Otherwise the commit fails, the operations are discarded, and the transaction is restarted

Simpler than locking!
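The begin / execute / commit / restart cycle can be sketched in a few lines of C++. This is only an illustration on a single shared word, not the DTM implementation: `run_transaction` and `demo_conflicting_commit` are hypothetical names, and a compare-and-swap stands in for the commit-time data race check.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Runs `body` as an optimistic transaction on one shared word.
// Returns the number of attempts that were needed to commit.
int run_transaction(std::atomic<long>& shared,
                    const std::function<long(long)>& body) {
    int attempts = 0;
    for (;;) {
        ++attempts;
        long snapshot = shared.load();   // begin: take a private snapshot
        long result = body(snapshot);    // execute on the private copy
        // Commit succeeds only if nobody changed `shared` in the meantime.
        if (shared.compare_exchange_strong(snapshot, result)) {
            return attempts;
        }
        // Commit failed: `result` is discarded and the loop restarts.
    }
}

// Forces exactly one conflict: during the first attempt the body itself
// plays the role of a concurrent writer that modifies the shared word.
int demo_conflicting_commit() {
    std::atomic<long> shared{10};
    bool interfered = false;
    int attempts = run_transaction(shared, [&](long v) {
        if (!interfered) {
            interfered = true;
            shared.store(100);           // simulated conflicting write
        }
        return v + 1;
    });
    assert(shared.load() == 101);        // the retry saw the new value 100
    return attempts;
}
```

Calling `demo_conflicting_commit()` runs the transaction body twice: the first commit fails because of the simulated conflicting write, and the restart commits on top of the new value.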

Page 3:

Transactional Memory

- Used in multiprocessor platforms
- Our work: the first TM implementation on a cluster
  - Supports both SQL and parallel scientific applications (C++)

Page 4:

TM in a Multiprocessor Node

- Multiple physical copies of data
- High memory overhead

[Figure: object A and a copy of A; T1: Read(A), T2: Write(A); T1 and T2 are both active.]

Page 5:

TM on a Cluster

Key Idea 1: Distributed Versions
- Different versions of data arise naturally in a cluster
- Create the new version on a different node; the others read their own versions

[Figure: four nodes labeled write, read, read, read.]

Page 6:

Exploiting Distributed Page Versions

[Figure: Distributed Transactional Memory (DTM). Nodes 0 through N, each with its own memory (mem0 … memN) and transaction (txn0 … txnN), connected by a network; the nodes are labeled with versions v3, v2, v1, v0.]

Pages 7-9:

Key Idea 2: Concurrent “Snapshots” Inside Each Node

[Figure, built up over three slides: two transactions inside one node, Txn0 (v1) and Txn1 (v2), read pages whose labels show a mix of versions v1 and v2.]

Page 10:

Distributed Transactional Memory

A novel fine-grained distributed concurrency control algorithm
- Low memory overhead
- Exploits distributed versions
- Supports multithreading within the node
- Provides 1-copy serializability

Page 11:

Outline

- Programming Interface
- Design
  - Data access tracking
  - Data replication
  - Conflict resolution
- Experiments
- Related work and Conclusions

Page 12:

Programming Interface

- init_transactions()
- begin_transaction()
- allocate_dtmemory()
- commit_transaction()

Need to declare TM variables explicitly

Page 13:

Data Access Tracking

DTM traps reads and writes to shared memory by either one of:
- Virtual memory protection
  - Classic page-level memory protection technique
- Operator overloading in C++
  - Trapping reads: conversion operator
  - Trapping writes: assignment ops (=, +=, …) & increment/decrement (++/--)
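A minimal sketch of the operator-overloading approach, assuming a wrapper type that counts accesses. `TrackedInt` and `AccessLog` are hypothetical names used only for illustration; they are not the `dtm_int` type from the DTM headers, and a real system would record which page or object was touched rather than bump counters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical access log: counts trapped reads and writes.
struct AccessLog {
    std::size_t reads = 0;
    std::size_t writes = 0;
};

class TrackedInt {
public:
    explicit TrackedInt(AccessLog& log, int initial = 0)
        : log_(&log), value_(initial) {}
    TrackedInt(const TrackedInt&) = default;

    // Read trap: every implicit conversion to int is a read.
    operator int() const { ++log_->reads; return value_; }

    // Write traps: plain assignment.
    TrackedInt& operator=(int v) { ++log_->writes; value_ = v; return *this; }
    TrackedInt& operator=(const TrackedInt& other) {
        int v = other;          // trapped read of `other`
        return *this = v;       // trapped write of `*this`
    }

    // Read-modify-write traps: compound assignment and increment/decrement.
    TrackedInt& operator+=(int v) { ++log_->reads; ++log_->writes; value_ += v; return *this; }
    TrackedInt& operator++() { ++log_->reads; ++log_->writes; ++value_; return *this; }
    TrackedInt& operator--() { ++log_->reads; ++log_->writes; --value_; return *this; }

private:
    AccessLog* log_;
    int value_;
};

// Small scenario: returns the log after three statements on two variables.
AccessLog demo_tracking() {
    AccessLog log;
    TrackedInt x(log, 1), y(log, 2);
    y = x + 40;      // read of x, write of y
    ++y;             // read and write of y
    int seen = y;    // read of y
    assert(seen == 42);
    return log;
}
```

The scenario in `demo_tracking()` records three reads and two writes. Postfix `++`/`--` and the remaining compound assignments would follow the same pattern and are left out to keep the sketch short.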

Page 14:

Data Replication

[Figure: two replicas, each holding Page 1, Page 2, …, Page n; update transaction T1 (UPDATE) runs on one of them.]

Page 15:

Twin Creation

[Figure: T1 writes page 1 ("Wr p1"); a twin of P1 is created.]

Page 16:

Twin Creation

[Figure: T1 writes page 2 ("Wr p2"); a twin of P2 is created next to the P1 twin.]

Page 17:

Diff Creation

[Figure: the same two replicas and T1 (UPDATE); no further labels survive.]
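Twin and diff creation can be illustrated with a short C++ sketch, assuming the usual twin/diff scheme: copy a page before its first write, then at commit compare the page against that copy and keep only the changed byte runs. The names `make_twin`, `make_diff` and `apply_diff` and the run-length encoding are assumptions for illustration, not the DTM data format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Page = std::vector<std::uint8_t>;

// One contiguous run of modified bytes within a page.
struct DiffRun {
    std::size_t offset;
    std::vector<std::uint8_t> bytes;
};
using Diff = std::vector<DiffRun>;

// Twin creation: copy the page before the first write touches it.
Page make_twin(const Page& page) { return page; }

// Diff creation: compare the modified page with its twin and
// record each run of bytes that differs.
Diff make_diff(const Page& twin, const Page& page) {
    assert(twin.size() == page.size());
    Diff diff;
    std::size_t i = 0;
    while (i < page.size()) {
        if (page[i] == twin[i]) { ++i; continue; }
        DiffRun run{i, {}};
        while (i < page.size() && page[i] != twin[i]) {
            run.bytes.push_back(page[i++]);
        }
        diff.push_back(run);
    }
    return diff;
}

// Applying a diff on another replica reproduces the writer's page.
void apply_diff(Page& page, const Diff& diff) {
    for (const DiffRun& run : diff) {
        for (std::size_t k = 0; k < run.bytes.size(); ++k) {
            page[run.offset + k] = run.bytes[k];
        }
    }
}
```

A replica that starts from the same contents as the twin and applies the diff ends up byte-identical to the writer's page, while only the modified runs travel over the network.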

Page 18:

Broadcast of the Modifications at Commit

[Figure: T1 (UPDATE) sends "Diff broadcast (vers 8)" to the other replica; both replicas show Latest Version = 7; existing diff queues hold v2, v1 and v1.]

Page 19:

Other Nodes Enqueue Diffs

[Figure: the receiving replica adds v8 to the diff queues of the modified pages, which now hold v1, v2, v8 and v1, v8; both replicas still show Latest Version = 7.]

Page 20:

Update Latest Version

[Figure: one replica now shows Latest Version = 8, the other still shows Latest Version = 7.]

Page 21:

Other Nodes Acknowledge Receipt

[Figure: "Ack (vers 8)" is sent back; the Latest Version labels are still 7 and 8.]

Page 22:

T1 Commits

[Figure: both replicas show Latest Version = 8.]
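The commit sequence on these slides (broadcast the diffs under a new version, receivers enqueue them and bump their latest version, receivers acknowledge, then the writer commits) can be sketched as a single-process simulation. Everything here is an assumption made for illustration: `Node`, `PendingDiff` and `commit_update` are hypothetical names, direct function calls stand in for network messages, and the new version is simply the writer's latest version plus one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A diff that has been received but not yet applied to its page.
struct PendingDiff {
    int version;
    std::string payload;
};

struct Node {
    int latest_version = 0;
    std::map<int, std::vector<PendingDiff>> queues;  // page id -> queued diffs

    // Receiver side: enqueue the diffs per page, record the new latest
    // version, and return the version number as the acknowledgement.
    int receive(int version, const std::map<int, std::string>& page_diffs) {
        for (const auto& entry : page_diffs) {
            queues[entry.first].push_back(PendingDiff{version, entry.second});
        }
        latest_version = version;
        return version;
    }
};

// Writer side: broadcast, wait for every acknowledgement, then commit.
bool commit_update(std::vector<Node>& cluster, std::size_t committer,
                   const std::map<int, std::string>& page_diffs) {
    const int version = cluster[committer].latest_version + 1;
    std::size_t acks = 0;
    for (std::size_t n = 0; n < cluster.size(); ++n) {
        if (n == committer) continue;  // the writer already has its own changes
        if (cluster[n].receive(version, page_diffs) == version) ++acks;
    }
    if (acks != cluster.size() - 1) return false;
    cluster[committer].latest_version = version;  // commit point
    return true;
}
```

With every node starting at version 7, one call to `commit_update` leaves all nodes at version 8, with the diffs queued (not applied) on the receivers, matching the labels on the slides.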

Pages 23-27:

Lazy Diff Application

[Figure, built up over five slides, with Latest Version = 8 throughout. Initially Page 1 is at V0 with diffs V1, V2, V8 queued, Page 2 is at V0 with diffs V1, V8 queued, and Page N is at V3 with diffs V4, V5 queued. T2 (V2) reads …, P1, P2: the queued diffs up to V2 are applied, leaving Page 1 at V2 and Page 2 at V1, each with V8 still queued. T3 (V8) then reads PN: diffs V4 and V5 are applied, leaving Page N at V5.]
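Lazy diff application can be sketched as follows, assuming each page keeps its queued diffs in ascending version order and a reading transaction applies only the diffs up to its own version. `VersionedPage`, `QueuedDiff` and `read_page` are hypothetical names, and a single integer stands in for the page contents.

```cpp
#include <cassert>
#include <deque>

// A queued diff: the version it creates and the contents it installs.
struct QueuedDiff {
    int version;
    int new_value;
};

struct VersionedPage {
    int version;                     // version of the applied contents
    int value;                       // stand-in for the page contents
    std::deque<QueuedDiff> pending;  // queued diffs, ascending by version
};

// Read on behalf of a transaction running at `txn_version`: apply queued
// diffs up to that version, leave newer ones queued, return the contents.
int read_page(VersionedPage& page, int txn_version) {
    // Simplification: a page that is already newer than the reader is a
    // conflict, which this sketch does not handle.
    assert(page.version <= txn_version);
    while (!page.pending.empty() &&
           page.pending.front().version <= txn_version) {
        const QueuedDiff& d = page.pending.front();
        page.value = d.new_value;
        page.version = d.version;
        page.pending.pop_front();
    }
    return page.value;
}
```

Replaying the slide example: a page at V0 with V1, V2 and V8 queued, read by a transaction at V2, ends up at V2 with only V8 still queued; a page at V3 with V4 and V5 queued, read at V8, ends up at V5 with an empty queue.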

Page 28:

Waiting Due to Conflict

[Figure: Latest Version = 8. Page 1 is at V2 and Page 2 is at V1, each with V8 queued; Page N is at V5. T2 (V2) has read …, P1, P2. T3 (V8) reads PN and then P2 and must wait until T2 commits.]

Pages 29-30:

Transaction Abort Due to Conflict

[Figure, built up over two slides: Latest Version = 8. Page 1 is at V0 with V1, V2, V8 queued; Page 2 is at V0 with V1, V8 queued; Page N is at V3 with V4, V5 queued. T3 (V8) reads P2, which brings Page 2 to V8. T2 (V2) then reads …, P1, P2 and hits a CONFLICT.]
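One way to read the two conflict cases on these slides is as a decision made on each page access. The rule below is an interpretation of the figures, not the algorithm from the paper: a transaction aborts if the page is already newer than its own version, and waits if it needs to upgrade a page that an active, older transaction has read. `PageState`, `Outcome` and `on_read` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

enum class Outcome { Proceed, Wait, Abort };

struct PageState {
    int version;                      // version of the applied contents
    std::vector<int> pending;         // versions of queued diffs, ascending
    std::vector<int> active_readers;  // versions of active txns that read it
};

// Decide what a transaction at `txn_version` may do when it reads `page`.
Outcome on_read(const PageState& page, int txn_version) {
    // The page has moved past the reader's snapshot: the reader must abort.
    if (page.version > txn_version) return Outcome::Abort;

    // Highest queued version the reader would need to apply.
    int target = page.version;
    for (int v : page.pending) {
        if (v <= txn_version) target = v;
    }
    if (target == page.version) return Outcome::Proceed;  // nothing to apply

    // Upgrading would pull the page out from under an older active reader.
    for (int reader : page.active_readers) {
        if (reader < target) return Outcome::Wait;
    }
    return Outcome::Proceed;
}
```

On the slide examples this gives Wait for T3 (V8) reading Page 2 at V1 while T2 (V2) is active on it, and Abort for T2 (V2) reading Page 2 after it has been brought to V8.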

Page 31:

Write-Write Conflict Resolution

Can be done in two ways:
- Executing all updates on a master node, which enforces the serialization order
OR
- Aborting the local update transaction upon receiving a conflicting diff flush

More on this in the paper

Page 32:

Experimental Platform

- Cluster of dual AMD Athlon computers
  - 512 MB RAM
  - 1.5 GHz CPUs
  - RedHat Fedora Linux OS

Page 33:

Benchmarks for Experiments

- TPC-W e-commerce benchmark
  - Models an on-line book store
  - Industry-standard workload mixes
    - Browsing (5% updates)
    - Shopping (20% updates)
    - Ordering (50% updates)
  - Database size of ~600 MB
- Hash-table micro-benchmark (in paper)

Page 34:

Application of DTM for E-Commerce

[Figure: customers reach web servers over the Internet via HTTP; web servers call app servers via RPC; app servers issue SQL to a single database.]

Page 35:

Application of DTM for E-Commerce

We use a Transactional Memory Cluster as the DB Tier

[Figure: the same three-tier layout, with the single database replaced by several DB servers.]

Page 36:

Cluster Architecture

[Figure: a scheduler and a MySQL in-memory tier made up of one master and four slaves; two boxes labeled "MMAP On-disk Database".]

Page 37:

Implementation Details

- We use MySQL’s in-memory HEAP tables
  - RB-Tree main-memory index
  - No transactional properties
    - Provided by inserting TM calls
- Multiple threads running on each node

Page 38:

Baseline for Comparison

- State-of-the-art conflict-aware protocol for scaling e-commerce on clusters
  - Coarse-grained (per-table) concurrency control

(USITS’03, Middleware’03)

Page 39:

Throughput Scaling

[Figure: throughput (WIPS, 0 to 350) versus number of slave replicas (0 to 8) for the Ordering, Shopping and Browsing mixes.]

Page 40:

Fraction of Aborted Transactions

# of slaves   Ordering   Shopping   Browsing
1             1.15%      1.44%      0.63%
2             0.35%      2.27%      1.34%
4             0.07%      1.70%      2.37%
6             0.02%      0.41%      2.07%
8             0.00%      0.22%      1.59%

Page 41:

Comparison (browsing)

[Figure: throughput (WIPS, 0 to 350) versus number of replicas (0 to 8), Conflict-Aware versus DTM.]

Page 42:

Comparison (shopping)

[Figure: throughput (WIPS, 0 to 350) versus number of replicas (0 to 8), Conflict-Aware versus DTM.]

Page 43:

Comparison (ordering)

[Figure: throughput (WIPS, 0 to 200) versus number of replicas (0 to 8), Conflict-Aware versus DTM.]

Page 44:

Related Work

- Distributed concurrency control for database applications
  - Postgres-R(SI), Wu and Kemme (ICDE’05)
  - Ganymed, Plattner and Alonso (Middleware’04)
- Distributed object stores
  - Argus (’83), QuickStore (’94), OOPSLA’03
- Distributed Shared Memory
  - TreadMarks, Keleher et al. (USENIX’94)
  - Tang et al. (IPDPS’04)

Page 45:

Conclusions

- New software-only transactional memory scheme on a cluster
  - Both strong consistency and scaling
- Fine-grained distributed concurrency control
  - Exploits distributed versions, low memory overheads
- Improved throughput scaling for e-commerce web sites

Page 46:

Questions?

Page 47:

Backup slides

Page 48:

Example Program

    #include <stdlib.h>      /* rand() */
    #include <dtm_types.h>

    typedef struct Point {
        dtm_int x;
        dtm_int y;
    } Point;

    int main(void) {
        init_transactions();
        for (int i = 0; i < 10; i++) {
            begin_transaction();
            /* allocate a Point in DTM-managed shared memory */
            Point *p = allocate_dtmemory();
            p->x = rand();
            p->y = rand();
            commit_transaction();
        }
        return 0;
    }

Page 49:

Query weights

[Figure: stacked bars of read and write query weights (0.0 to 1.0) for six configurations: OrdIdx (0.35), ShpIdx (0.1), BrwIdx (0.03), Ord,NoIdx (0.26), Shp,NoIdx (0.07), Brw,NoIdx (0.02).]

Page 50:

Decreasing the fraction of aborts

[Figure: fraction of aborts (0.00% to 3.00%) for M + 2S, M + 4S, M + 6S and M + 8S, each shown with and without "Confl. Reduce". Bar labels in extraction order: 1.34%, 2.34%, 2.37%, 2.68%, 2.07%, 2.83%, 1.59%, 1.34%.]

Page 51:

Micro benchmark experiments

[Figure: throughput (x 1000, 0 to 1200) versus number of machines (1 to 10); series labeled 1%, 5%, 10%, 15%, 20%.]

Page 52:

Micro benchmark experiments (with read-only optimization)

[Figure: throughput (x 1000, 0 to 500) versus number of machines (1 to 10); series R/O Opt and Base.]

Page 53:

Fraction of aborts

# of machines   1      2      4      6      8      10
% aborts        0      0.57   1.69   2.94   4.05   5.08