1
Window-Based Greedy Contention Management for Transactional Memory
Gokarna Sharma (LSU)
Brett Estrade (Univ. of Houston)
Costas Busch (LSU)
DISC 2010 - 24th International Symposium on Distributed Computing
2
Transactional Memory - Background
• The emergence of multi-core architectures
  – Opportunities and challenges
• How to handle access to shared data?
  – Locks, monitors, …
• Transactional memory (TM) is an alternative synchronization abstraction
  – Simple, composable, …
• Three types: hardware, software, and hybrid TMs
  – Our focus is on software TM (STM) systems
3
STM Systems
• Progress is ensured through a contention management (CM) policy
• If transactions modify different data
  – Everything is OK
• If transactions modify the same data
  – Conflicts arise that must be resolved; this is the job of the contention management policy
• Of particular interest are greedy contention managers
  – Transactions restart immediately after every abort
4
Prior Work
• Mostly empirical evaluation
• Theoretical analysis
  – [Guerraoui et al., PODC’05]
    • Greedy Contention Manager
    • Competitive ratio = O(s²) (s is the number of shared resources)
  – [Attiya et al., PODC’06]
    • Improved to O(s)
  – [Schneider & Wattenhofer, ISAAC’09]
    • RandomizedRounds Contention Manager
    • Competitive ratio = O(C · log n) (C is the maximum number of conflicting transactions and n is the number of transactions)
  – [Attiya & Milani, OPODIS’09]
    • Bimodal Scheduler
    • Competitive ratio = O(s) (for bimodal workloads with equi-length transactions)
5
Our Contributions
• Execution window model for TM
• Makespan bound of any CM algorithm based on the contention measure C within the window and the window parameters M and N
• Two new randomized contention management algorithms that are very close to O(s)-competitive
• An adaptive version that adapts to the amount of contention C
[Figure: an M × N execution window; rows are threads 1 to M, columns are transactions 1 to N]
6
Roadmap
• Previous TM models and problem complexity
• Our TM model
• Our algorithms and proof ideas
7
Previous TM Models
• One-shot scheduling problem
  – n transactions, a single transaction per thread
  – Best bound proven to be achievable is O(s)
• Problem complexity: directly related to vertex coloring
  – Coloring problem -> one-shot scheduling problem -> one-shot scheduling solution -> coloring solution
• NP-hard to approximate an optimal vertex coloring
• Can we do better under the limitations of the coloring reduction?
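The link between one-shot scheduling and vertex coloring can be illustrated with a small sketch: each transaction is a vertex, each conflict is an edge, and all transactions with the same color run in the same time step. The greedy coloring heuristic and the names `greedy_coloring` and `schedule_from_coloring` are assumptions chosen for brevity; they are not taken from the talk.

```python
def greedy_coloring(conflicts):
    """Give each transaction the smallest color not used by its neighbors.

    `conflicts` maps a transaction name to the set of transactions it
    conflicts with (an undirected conflict graph).
    """
    color = {}
    for t in sorted(conflicts):
        used = {color[u] for u in conflicts[t] if u in color}
        c = 0
        while c in used:
            c += 1
        color[t] = c
    return color


def schedule_from_coloring(color):
    """Group transactions by color; each color class is one conflict-free step."""
    steps = {}
    for t, c in color.items():
        steps.setdefault(c, []).append(t)
    return [sorted(steps[c]) for c in sorted(steps)]


# Hypothetical one-shot instance: T1-T2-T3 form a path, T4 is independent.
conflicts = {
    "T1": {"T2"},
    "T2": {"T1", "T3"},
    "T3": {"T2"},
    "T4": set(),
}
color = greedy_coloring(conflicts)
schedule = schedule_from_coloring(color)
```

Here the schedule has two steps, which matches the two colors needed for a path. A greedy heuristic does not return an optimal coloring in general, consistent with the hardness remark above.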
8
Execution Window Model
• An M × N window W
  – M threads with a sequence of N transactions per thread, i.e., a collection of N one-shot transaction sets
[Figure: the M × N window W; rows are threads 1 to M, columns are transactions 1 to N]
9
Makespan Bounds
• Let C denote the maximum number of conflicting transactions for any transaction inside the window
• Trivial makespan bounds:
  – Straightforward upper bound: τ · min(CN, MN), where τ is the execution time duration of a transaction
  – One-shot analysis bound [Attiya et al., PODC’06]: O(sN)
  – Using RandomizedRounds [Schneider & Wattenhofer, ISAAC’09] N times, makespan bound: O(τ · CN log M)
• Our bounds:
  – Offline-Greedy: makespan bound = O(τ · (C + N log(MN))) and competitive ratio = O(s + log(MN)) with high probability
  – Online-Greedy: makespan bound = O(τ · (C log(MN) + N log²(MN))) and competitive ratio = O(s · log(MN) + log²(MN)) with high probability
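To get a feel for how these expressions grow, the sketch below evaluates them for one hypothetical window. It assumes all hidden constants are 1 and uses natural logarithms, so only the relative sizes are meaningful; the function name `makespan_bounds` and the sample parameters are illustrative, not from the talk.

```python
import math


def makespan_bounds(tau, C, M, N):
    """Evaluate the bound expressions with every hidden constant set to 1."""
    log_mn = math.log(M * N)
    return {
        "trivial": tau * min(C * N, M * N),
        "randomized_rounds_n_times": tau * C * N * math.log(M),
        "offline_greedy": tau * (C + N * log_mn),
        "online_greedy": tau * (C * log_mn + N * log_mn ** 2),
    }


# Hypothetical window: 64 threads, 100 transactions each, contention C = 64.
bounds = makespan_bounds(tau=1.0, C=64, M=64, N=100)
for name, value in bounds.items():
    print(f"{name:>26}: {value:10.1f}")
```

With these parameters the Offline-Greedy expression is the smallest of the four, and the Online-Greedy expression is larger by roughly a log(MN) factor, as the formulas suggest.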
10
Intuition
• The random delays shift conflicting transactions relative to each other inside the window, so their execution times may not coincide
• The benefit is most apparent when conflicts are frequent between transactions in the same column and less frequent between transactions in different columns
[Figure: the M × N window with a random interval placed before each thread's transactions; labels N, N’, and "Random interval"]
11
How it works? (1/2)
• Random intervals: assume each thread Pi knows Ci and each transaction has the same duration τ (this assumption can be removed)
• Conflicts: divide time steps into frames (each time step is of size τ)
  – Frame size depends on the conflict resolution strategy of the algorithm
• Number of frames in random intervals: each thread chooses a random number qi independently and uniformly at random from the range [0, αi − 1], where αi = Ci / log(MN)
• Handling conflicts: use priorities
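The random-interval step can be sketched as follows. The slide does not say how a fractional αi is rounded, so rounding up with a minimum of 1 is an assumption here, as are the natural logarithm, the function name `random_delays`, and the sample Ci values.

```python
import math
import random


def random_delays(C_per_thread, M, N, rng):
    """Draw q_i uniformly from [0, alpha_i - 1], with alpha_i = C_i / log(MN)."""
    log_mn = math.log(M * N)
    delays = []
    for C_i in C_per_thread:
        # Assumption: round alpha_i up and keep it at least 1 so the range is non-empty.
        alpha_i = max(1, math.ceil(C_i / log_mn))
        delays.append(rng.randrange(alpha_i))  # uniform over 0 .. alpha_i - 1
    return delays


rng = random.Random(0)          # seeded only to make the sketch repeatable
C_per_thread = [40, 10, 3, 80]  # hypothetical C_i for M = 4 threads
delays = random_delays(C_per_thread, M=4, N=16, rng=rng)
```

Threads with higher contention draw from a wider range of delays, so their transactions are spread over more frames; a thread with little contention (the third one here) starts immediately.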
12
How it works? (2/2)
[Figure: threads 1 to M, each delayed by its random interval and then executing its N transactions in consecutive frames; F11 is the first frame of Thread 1, where T11 executes, and F12 is the second frame of Thread 1, where T12 executes]
• q1 ϵ [0, α1 − 1], α1 = C1 / log(MN)
• C = max_i Ci, 1 ≤ i ≤ M
• Makespan = (C / log(MN) + number of frames) × frame size = (C / log(MN) + N) × frame size
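The makespan expression on this slide can be checked against a concrete draw of delays. The sketch below is illustrative only: the helper names, the frame size of 5 time steps, and the sample delays are assumptions, and logarithms are natural.

```python
import math


def frames_upper_bound(C, M, N):
    """Frames in the bound: C / log(MN) delay frames plus N execution frames."""
    return C / math.log(M * N) + N


def realized_frames(delays, N):
    """Frames actually spanned: the most-delayed thread finishes last."""
    return max(delays) + N


def makespan(num_frames, frame_size):
    """Total time steps for a given number of frames."""
    return num_frames * frame_size


# Hypothetical window: M = 4 threads, N = 16 transactions per thread, C = 80.
bound = makespan(frames_upper_bound(C=80, M=4, N=16), frame_size=5)
actual = makespan(realized_frames([3, 0, 1, 12], N=16), frame_size=5)
```

Because every qi is below C / log(MN), the realized number of frames never exceeds the bound, provided each transaction commits within its assigned frame.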
13
Offline-Greedy Algorithm (1/2)
• Initialization:
  – Frames are of size Φ = Θ(τ · ln(MN)) time steps
  – Each thread Pi is initially assigned a random period of qi ϵ [0, αi − 1] frames, αi = Ci / log(MN)
  – Each transaction Tij is assigned to frame Fij = qi + (j − 1)
• Priority assignment: each transaction has one of two priorities, low or high
  – Transaction Tij is initially in low priority
  – Tij switches to high priority in the first time step of frame Fij and remains in high priority thereafter
• Conflict resolution: uses the conflict graph explicitly to resolve conflicts
  – The conflict graph is dynamic and evolves as the execution of the transactions progresses
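The frame assignment and the low-to-high priority switch described above can be written down directly. This is a minimal sketch; the function names and the string values "low" and "high" are illustrative choices, and frames are indexed from 0.

```python
def assigned_frame(q_i, j):
    """Frame F_ij = q_i + (j - 1) for the j-th transaction (1-based) of thread i."""
    return q_i + (j - 1)


def priority(q_i, j, current_frame):
    """Low before the assigned frame begins, high from that frame onward."""
    return "high" if current_frame >= assigned_frame(q_i, j) else "low"


# Hypothetical thread with random delay q_i = 2 and N = 3 transactions.
frames = [assigned_frame(2, j) for j in (1, 2, 3)]
timeline = [priority(2, 1, f) for f in range(5)]
```

The thread's transactions occupy consecutive frames 2, 3, and 4, and its first transaction is in low priority during frames 0 and 1 before switching to high for the rest of the execution.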
14
Offline-Greedy Algorithm (2/2)
• Proof intuition: with high probability each transaction commits in its assigned frame
  – Let A’ ⊆ A denote the subset of transactions conflicting with Tij in frame Fij
    • If |A’| ≤ log(MN) − 1, then Tij commits in frame Fij
    • |A’| ≥ log(MN) with probability at most (1/MN)²
• Makespan: O(τ · (C + N log(MN))) with high probability
  – Pro: for C ≤ N log(MN), the makespan is a log(MN) factor from optimal, since τ · N is a trivial lower bound
  – Con: need to know the dependency graph to resolve conflicts
• Competitive ratio: O(s + log(MN)) with high probability
  – Pro: independent of the choice of C
15
Online-Greedy Algorithm (1/2)
• Online in the sense that it does not depend on knowing the dependency graph to resolve conflicts
• Similar to Offline-Greedy except for the conflict resolution strategy
• Priority assignment
  – Two different priorities are associated with each transaction as a vector (π(1), π(2))
  – π(1) represents the Boolean priority as in Offline-Greedy
  – π(2) ∈ [1, M] represents the random priority: a transaction chooses π(2) uniformly at random at the start of frame Fij and after every abort [idea from Schneider & Wattenhofer, ISAAC’09]
• Conflict resolution
  – On conflict of Tij with Tkl: if πij(2) < πkl(2) then abort(Tij, Tkl), otherwise abort(Tkl, Tij)
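A minimal sketch of the two-component priority and the conflict rule follows. It assumes that abort(X, Y) means X aborts Y, that a high Boolean priority beats a low one, and that among equal Boolean priorities the smaller π(2) wins; the class `Txn` and the functions `key` and `resolve` are hypothetical names, not part of the talk.

```python
import random
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    high: bool = False  # pi(1): Boolean priority, as in Offline-Greedy
    rand: int = 0       # pi(2): random priority in [1, M]

    def draw(self, M, rng):
        """Redraw pi(2); done at the start of the assigned frame and after each abort."""
        self.rand = rng.randint(1, M)


def key(t):
    # Assumption: compare lexicographically, high before low, then smaller pi(2).
    return (0 if t.high else 1, t.rand)


def resolve(a, b):
    """Return (winner, loser); a wins only on a strictly smaller key."""
    return (a, b) if key(a) < key(b) else (b, a)


rng = random.Random(1)  # seeded only to make the sketch repeatable
t1 = Txn("T11", high=True, rand=2)
t2 = Txn("T21", high=True, rand=5)
winner, loser = resolve(t1, t2)
loser.draw(M=8, rng=rng)  # the aborted transaction picks a fresh pi(2)
```

No conflict graph is consulted: each conflict is settled locally from the two priority vectors, which is what makes the algorithm online.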
16
Online-Greedy Algorithm (2/2)
• Proof intuition: frame duration is now Φ’ = O(τ · log²(MN))
  – Analysis is similar to Offline-Greedy
• Makespan: O(τ · (C log(MN) + N log²(MN))) with high probability
  – Pro: no need to know the dependency graph to resolve conflicts
  – Con: makespan is worse than that of Offline-Greedy
• Competitive ratio: O(s · log(MN) + log²(MN)) with high probability
  – Pro: independent of the contention measure C
17
Adaptive-Greedy Algorithm
• Limitations of the Offline-Greedy and Online-Greedy algorithms
  – The values of Ci need to be known in advance
• Adaptive-Greedy: each thread starts by guessing Ci = 1
  – Similar to the exponential back-off strategy used by Polka
  – Based on the current Ci estimate, the thread attempts to execute the Online-Greedy algorithm
  – If a thread Pi is unable to commit transactions (bad event), then Pi assumes its choice of Ci is incorrect and starts over with Ci’ = 2 Ci for the remaining transactions
• The correct choice of Ci is reached in log Ci iterations
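The doubling strategy can be sketched as a simple loop. The predicate `commits_with` is a hypothetical stand-in for "the thread managed to commit its transactions under this estimate"; in the algorithm itself that outcome is observed during execution rather than computed from a known value.

```python
def adaptive_estimate(commits_with):
    """Double the guess for C_i until commits_with(guess) reports success."""
    guess, rounds = 1, 0
    while not commits_with(guess):
        guess *= 2   # bad event: assume the estimate was too small
        rounds += 1
    return guess, rounds


# Hypothetical thread whose true contention is 37.
true_C = 37
guess, rounds = adaptive_estimate(lambda g: g >= true_C)
```

Starting from 1, the guess passes through 2, 4, 8, 16, and 32 before settling on 64 after six doublings, in line with the log Ci iteration count stated above.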
18
Discussions
• For variable-length transactions
  – τ in the makespan bounds is replaced with τmax, the maximum duration of any transaction in the window
  – The competitive ratio bounds gain a τmax / τmin factor, where τmin is the minimum duration of any transaction in the window
• Future extensions
  – Instead of one randomization interval at the beginning of the window, random periods of low priority between subsequent transactions
  – Dynamic expansion and contraction of the execution window to preserve the contention measure C
19
Conclusions
• Execution window model for TM
• Two new randomized greedy CM algorithms that are very close to O(s)-competitive
• Adaptive version of the previous algorithms that achieves better performance by not requiring the value of C to be known in advance