deuce stm - cmp'09

Noninvasive Java Concurrency with Deuce STM 1.0 Guy Korland “Multi Core Tools” CMP09

Upload: guy-korland

Post on 11-May-2015


Page 1: Deuce STM - CMP'09

Noninvasive Java Concurrency with Deuce STM 1.0

Guy Korland “Multi Core Tools” CMP09

Page 2: Deuce STM - CMP'09

Outline

• Motivation
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 3: Deuce STM - CMP'09

Motivation

Page 4: Deuce STM - CMP'09

Problem I

Process 1            Process 2

a = acc.get()
a = a + 100
                     b = acc.get()
                     b = b + 50
                     acc.set(b)
acc.set(a)

... Lost Update! ...  
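The interleaving above can be replayed step by step in a single thread, which makes the outcome deterministic. This is only a sketch: the `Account` class with `get`/`set` is assumed from the `acc.get()` / `acc.set()` calls on the slide, and the starting balance of 0 is an arbitrary choice.

```java
// Hypothetical Account with the get/set accessors used on the slide.
class Account {
    private int balance;
    Account(int initial) { balance = initial; }
    int get() { return balance; }
    void set(int value) { balance = value; }
}

class LostUpdateDemo {
    // Replays the slide's interleaving in program order.
    static int replay(Account acc) {
        int a = acc.get();   // Process 1 reads the balance
        a = a + 100;         // Process 1 computes its update
        int b = acc.get();   // Process 2 reads the same old balance
        b = b + 50;          // Process 2 computes its update
        acc.set(b);          // Process 2 writes
        acc.set(a);          // Process 1 overwrites: the +50 is lost
        return acc.get();
    }

    public static void main(String[] args) {
        // Both updates applied correctly would give 150.
        System.out.println(replay(new Account(0))); // prints 100
    }
}
```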

Page 5: Deuce STM - CMP'09

Problem II

Process 1            Process 2

lock(A)              lock(B)
lock(B)              lock(A)

... Deadlock! ...
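A sketch of the same lock ordering in Java. To keep the example terminating, each thread probes its second lock with `tryLock()` instead of blocking; with a plain `lock()` call both threads would wait forever. The class and method names are illustrative, not taken from the talk.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class DeadlockDemo {
    // Takes `first`, waits until the peer holds its own first lock, then
    // probes `second` without blocking. Returns true if `second` was free.
    static boolean probe(ReentrantLock first, ReentrantLock second,
                         CountDownLatch bothHoldFirst, CountDownLatch bothProbed)
            throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean got = second.tryLock();   // lock() would block forever here
            if (got) second.unlock();
            bothProbed.countDown();
            bothProbed.await();               // keep `first` until the peer has probed
            return got;
        } finally {
            first.unlock();
        }
    }

    // True when neither process could take its second lock.
    static boolean wouldDeadlock() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(2);
        CountDownLatch probed = new CountDownLatch(2);
        boolean[] got = new boolean[2];
        Thread p1 = new Thread(() -> {       // Process 1: lock(A) then lock(B)
            try { got[0] = probe(a, b, held, probed); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread p2 = new Thread(() -> {       // Process 2: lock(B) then lock(A)
            try { got[1] = probe(b, a, held, probed); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        p1.start();
        p2.start();
        try {
            p1.join();
            p2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return !got[0] && !got[1];
    }

    public static void main(String[] args) {
        System.out.println("would deadlock: " + wouldDeadlock());
    }
}
```

Acquiring the two locks in the same order in both processes removes the cycle, but that is exactly the kind of convention the following slides argue against relying on.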

Page 6: Deuce STM - CMP'09

The Problem

• Cannot exploit cheap threads
• Today’s Software
  o Non-scalable methodologies
• Today’s Hardware
  o Poor support for scalable synchronization.
  o Low-level support only: CAS, TAS, MemBar…

Page 7: Deuce STM - CMP'09

The Problem

Page 8: Deuce STM - CMP'09

Why Doesn’t Locking Scale?

• Not Robust
• Relies on conventions
• Hard to Use
  o Conservative
  o Deadlocks
  o Lost wake-ups
• Not Composable

Page 9: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 10: Deuce STM - CMP'09

Solutions I – Domain specific

•MATLAB – Concurrency behind the scenes.
•SQL/XQuery/XPath – DB will handle it…
•HTML, ASP, PHP, JSP … – (almost) stateless.
•Fortress [Sun], X10 [IBM], Chapel [Cray] … – implicit concurrency. Remember Cobol!

Domain too specific

Page 11: Deuce STM - CMP'09

Solutions II – Actor Model (Share nothing model)

•Carl Hewitt, Peter Bishop and Richard Steiger, A Universal Modular Actor Formalism for Artificial Intelligence [IJCAI 1973].

•An actor, on message:
  – no shared data
  – send messages to other actors
  – create new actors

•Where can we find it?
  – Simula, Smalltalk, Scala, Haskell, F#, Erlang...

Functional languages

Page 12: Deuce STM - CMP'09

Solutions II – Actor Model (Share nothing model)

-module(counter).
-export([run/0, counter/1]).

run() ->
    S = spawn(counter, counter, [0]),
    send_msgs(S, 100000),
    S.

counter(Sum) ->
    receive
        {inc, Amount} -> counter(Sum+Amount)
    end.

send_msgs(_, 0) -> true;
send_msgs(S, Count) ->
    S ! {inc, 1},
    send_msgs(S, Count-1).

Actors in Erlang

•Is it really easier?
•What about performance?
•Will functional languages ever be functional?
•Java/.NET/C++ rules!!! (maybe Ruby)
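For comparison, the Erlang counter can be approximated in plain Java with one thread and a mailbox queue. This is a rough sketch, not a full actor runtime: `CounterActor`, `send`, `stop` and the `STOP` sentinel are names made up for this example, and the stop message is an addition so that the final sum can be read back (the Erlang version never terminates its counter process).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActor implements Runnable {
    private static final int STOP = Integer.MIN_VALUE;  // hypothetical stop message
    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private int sum;  // private state, touched only by the actor's own thread

    void send(int amount) { mailbox.add(amount); }  // like S ! {inc, Amount}
    void stop() { mailbox.add(STOP); }

    @Override
    public void run() {
        try {
            // The receive loop: handle one message at a time, in arrival order.
            for (int msg = mailbox.take(); msg != STOP; msg = mailbox.take()) {
                sum += msg;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Mirrors run() plus send_msgs/2 from the Erlang module.
    static int runCounter(int count) {
        CounterActor actor = new CounterActor();
        Thread thread = new Thread(actor);
        thread.start();
        for (int i = 0; i < count; i++) {
            actor.send(1);
        }
        actor.stop();
        try {
            thread.join();  // join() makes the actor's writes visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return actor.sum;
    }

    public static void main(String[] args) {
        System.out.println(runCounter(100000));
    }
}
```

No lock protects `sum` because no other thread touches it while the actor runs; all sharing goes through the mailbox.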

Page 13: Deuce STM - CMP'09

Solutions III – STM
Nir Shavit, Dan Touitou, Software Transactional Memory [PODC95]

synchronized {
    <instructions>
}

atomic {
    <instructions>
}

l.lock();
<instructions>
l.unlock();

Page 14: Deuce STM - CMP'09

What is a transaction?

• Atomicity – all or nothing

• Consistency – consistent state (after & before)

• Isolation – others can’t see intermediate state.

• Durability - persistent

Or maybe we do want it?

Page 15: Deuce STM - CMP'09

The Brief History of STM

• 1993 – STM (Shavit, Touitou)
• 1997 – Trans Support TM (Moir)
• 2003 – DSTM (Herlihy et al)
• 2003 – WSTM (Fraser, Harris)
• 2003 – OSTM (Fraser, Harris)
• 2004 – ASTM (Marathe et al)
• 2004 – T-Monitor (Jagannathan…)
• 2004 – HybridTM (Moir)
• 2004 – Meta Trans (Herlihy, Shavit)
• 2004 – Soft Trans (Ananian, Rinard)
• 2005 – Lock-OSTM (Ennals)
• 2005 – McTM (Saha et al)
• 2005 – TL (Dice, Shavit)
• 2006 – AtomJava (Hindman…)
• 2006 – LSA (Riegel et al)
• 2006 – TL2 (Dice, Shavit, Shalev)
• 2006 – DSTM2 (Herlihy, Luchangco)
• 2007 – Tanger
• 2008 – Rock (Sun)
• 2009 – Deuce (Korland et al)

Page 16: Deuce STM - CMP'09

DSTM2
Maurice Herlihy et al, A flexible framework … [OOPSLA06]

@atomic public interface INode {
    int getValue();
    void setValue(int value);
    INode getNext();
    void setNext(INode value);
}

Factory<INode> factory = Thread.makeFactory(INode.class);

result = Thread.doIt(new Callable<Boolean>() {
    public Boolean call() {
        return intSet.insert(value);
    }
});

•Limited to Objects.

•Very intrusive.

•Doesn’t support libraries.

•Bad performance (fork).

Page 17: Deuce STM - CMP'09

JVSTM
João Cachopo and António Rito-Silva, Versioned boxes as the basis for memory transactions [SCOOL05]

public class Account{

    private VBox<Long> balance = new VBox<Long>();

    public @Atomic void withdraw(long amount) {
        balance.put(balance.get() - amount);
    }
}

•Doesn’t support libraries.

•Less intrusive.

•Need to “Announce” shared fields

Page 18: Deuce STM - CMP'09

Atom-Java
B. Hindman and D. Grossman. Atomicity via source-to-source translation. [MSPC06]

public void update( double value){
    atomic {
        commission += value;
    }
}

•Add a reserved word.

•Need precompilation.

•Doesn’t support libraries.

•Even Less intrusive.

Page 19: Deuce STM - CMP'09

Multiverse
Peter Veentjer, 2009

@TmEntity
public class Stack<E>{

    private Node<E> head;

    public void push(E item) {
        head = new Node<E>(item, head);
    }

    @TmEntity
    public static class Node<E> {
        final E value;
        final Node<E> parent;

        Node(E value, Node<E> prev) {
            this.value = value;
            this.parent = prev;
        }
    }
}

•Doesn’t support libraries.

•Limited to Objects.

Page 20: Deuce STM - CMP'09

DATM-J
Hany E. Ramadan et al., Dependence-aware transactional memory [MICRO08]

Transaction tx = new Transaction(id);
boolean done = false;
while (!done) {
    try {
        tx.BeginTransaction();
        // txnl code
        done = tx.CommitTransaction();
    } catch (AbortException e) {
        tx.AbortTransaction();
        done = false;
    }
}

•Explicit transaction.

•Explicit retry.

Page 21: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 22: Deuce STM - CMP'09

Deuce STM

• Java STM framework
  o @Atomic methods
  o Field-based access
      More scalable than object-based.
      More efficient than word-based.
  o Supports external libraries
      Can be part of a transaction
  o No reserved words
      No need for new compilers (existing IDEs can be used)
• Research tool
  o API for developing and testing new algorithms.

Page 23: Deuce STM - CMP'09

Deuce - API

public class Bank{

    final private static double MAXIMUM_TRANSACTION = 1000;
    private double commission = 0;

    @Atomic(retries=64)
    public void transaction( Account ac1, Account ac2, double amount){
        ac1.balance -= (amount + commission);
        ac2.balance += amount;
    }

    @Atomic
    public void update( double value){
        commission += value;
    }
}

Page 24: Deuce STM - CMP'09

Deuce - Overview

Page 25: Deuce STM - CMP'09

Deuce - Running

• -javaagent:deuceAgent.jar
  o Dynamic bytecode manipulation.
• -Xbootclasspath/p:rt.jar
  o Offline instrumentation to support boot classloader.
• java -javaagent:deuceAgent.jar -cp "myjar.jar" MyMain

Page 26: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 27: Deuce STM - CMP'09

Implementation

• ASM – Bytecode manipulation
  o Online & Offline
• Fields
  o private double commission;
  o final static public long commission__ADDRESS...
      Relative address (-1 if final).
  o final static public Object __CLASS_BASE__ ...
      Mark the class base for static fields access.

Page 28: Deuce STM - CMP'09

Implementation

• Method
  o @Atomic methods
      Replace the method with a transaction retry loop.
      Add another instrumented method.
  o Non-Atomic methods
      Duplicate each with an instrumented version.

Page 29: Deuce STM - CMP'09

Implementation

@Atomic
public void update( double value){
    commission += value;
}

In byte code

@Atomic
public void update( double value){
    double tmp = commission;
    commission = tmp + value;
}

Page 30: Deuce STM - CMP'09

Implementation

public void update( double value, Context c){
    double tmp;
    if( commission__ADDRESS < 0 ) { // final field
        tmp = commission;
    } else{
        c.beforeRead( this, commission__ADDRESS);
        tmp = c.onRead( this, commission, commission__ADDRESS);
    }
    c.onWrite( this, tmp + value, commission__ADDRESS);
}

JIT removes the final-field check

Page 31: Deuce STM - CMP'09

Implementation

public void update( double value, Context c){
    c.beforeRead( this, commission__ADDRESS);
    double tmp = c.onRead( this, commission, commission__ADDRESS);
    c.onWrite( this, tmp + value, commission__ADDRESS);
}

Page 32: Deuce STM - CMP'09

Implementation

public void update( double value){
    Context context = ContextDelegator.getContext();
    for( int i = retries ; i > 0 ; --i){
        context.init();
        try{
            update( value, context);    // instrumented duplicate
            if( context.commit())
                return;
        }catch ( TransactionException e ){
            context.rollback();         // conflict: undo and retry
            continue;
        }catch ( Throwable t ){
            if( context.commit())       // user exception: commit, then rethrow
                throw t;
        }
    }
    throw new TransactionException();   // retries exhausted
}
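The retry loop can be exercised without Deuce by swapping in small stand-ins. In the sketch below `TxContext`, `TxAbort`, `RetryLoop` and `FlakyContext` are all hypothetical names invented for the example; only the shape of the loop follows the slide. `FlakyContext` simply fails a fixed number of commits so the retries become visible.

```java
import java.util.function.Consumer;

// Hypothetical, trimmed-down stand-ins for the slide's Context and TransactionException.
interface TxContext {
    void init();
    boolean commit();
    void rollback();
}

class TxAbort extends RuntimeException {}

class RetryLoop {
    // Runs `body` until a commit succeeds or `retries` attempts are used up.
    static void atomically(TxContext ctx, int retries, Consumer<TxContext> body) {
        for (int i = retries; i > 0; --i) {
            ctx.init();
            try {
                body.accept(ctx);
                if (ctx.commit()) return;
            } catch (TxAbort e) {
                ctx.rollback();                 // conflict inside the body: retry
            } catch (RuntimeException | Error t) {
                if (ctx.commit()) throw t;      // user exception: commit, then rethrow
            }
        }
        throw new TxAbort();                    // retries exhausted
    }
}

// Test double: the first `failures` commits report a conflict.
class FlakyContext implements TxContext {
    int failuresLeft;
    int inits;
    FlakyContext(int failures) { failuresLeft = failures; }
    public void init() { inits++; }
    public boolean commit() { return failuresLeft-- <= 0; }
    public void rollback() {}
}

class RetryLoopDemo {
    public static void main(String[] args) {
        FlakyContext ctx = new FlakyContext(2);
        int[] runs = {0};
        RetryLoop.atomically(ctx, 64, c -> runs[0]++);
        System.out.println(runs[0]); // prints 3: two failed commits, then success
    }
}
```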

Page 33: Deuce STM - CMP'09

Implementation

public interface Context{

    void init( int atomicBlockId);
    boolean commit();
    void rollback();

    void beforeReadAccess( Object obj, long field);
    Object onReadAccess( Object obj, Object value, long field);
    int onReadAccess( Object obj, int value, long field);
    long onReadAccess( Object obj, long value, long field);
    …
    void onWriteAccess( Object obj, Object value, long field);
    void onWriteAccess( Object obj, int value, long field);
    void onWriteAccess( Object obj, long value, long field);
    …
}

Page 34: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 35: Deuce STM - CMP'09

TL2 (Transaction Locking II)
Dave Dice, Ori Shalev and Nir Shavit [DISC06]

CTL - Commit-time locking
• Start
  o Sample global version-clock
• Run through a speculative execution
  o Collect write-set & read-set
• End
  o Lock the write-set
  o Increment global version-clock
  o Validate the read-set
  o Commit and release the locks
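The steps above can be sketched for a toy transactional `int` cell. This is a simplified illustration, not Deuce's implementation: `Tl2Sketch`, `TVar`, `Tx` and the versioned-lock encoding (version in the upper bits, lock flag in the lowest bit) are assumptions made for the example, and a transaction that hits a conflict simply throws `Abort` so the caller can retry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

class Tl2Sketch {
    static final AtomicLong CLOCK = new AtomicLong();      // global version-clock

    static class Abort extends RuntimeException {}

    // Transactional int cell; vlock = (version << 1) | lockedBit.
    static class TVar {
        volatile int value;
        final AtomicLong vlock = new AtomicLong();
        TVar(int initial) { value = initial; }
    }

    static class Tx {
        final long rv = CLOCK.get();                       // Start: sample the clock
        final Set<TVar> readSet = new HashSet<>();
        final Map<TVar, Integer> writeSet = new LinkedHashMap<>();

        int read(TVar v) {
            Integer pending = writeSet.get(v);
            if (pending != null) return pending;           // read-your-own-write
            long pre = v.vlock.get();
            int val = v.value;
            long post = v.vlock.get();
            // Abort if the cell is locked, changed mid-read, or newer than rv.
            if (pre != post || (pre & 1) != 0 || (pre >>> 1) > rv) throw new Abort();
            readSet.add(v);
            return val;
        }

        void write(TVar v, int val) { writeSet.put(v, val); }  // buffered until commit

        void commit() {
            if (writeSet.isEmpty()) return;                // reads were validated on access
            List<TVar> locked = new ArrayList<>();
            try {
                for (TVar v : writeSet.keySet()) {         // 1. lock the write-set
                    long cur = v.vlock.get();
                    if ((cur & 1) != 0 || !v.vlock.compareAndSet(cur, cur | 1)) throw new Abort();
                    locked.add(v);
                }
                long wv = CLOCK.incrementAndGet();         // 2. increment the clock
                for (TVar v : readSet) {                   // 3. validate the read-set
                    long cur = v.vlock.get();
                    boolean lockedByOther = (cur & 1) != 0 && !writeSet.containsKey(v);
                    if (lockedByOther || (cur >>> 1) > rv) throw new Abort();
                }
                for (Map.Entry<TVar, Integer> e : writeSet.entrySet()) {
                    e.getKey().value = e.getValue();       // 4. commit...
                    e.getKey().vlock.set(wv << 1);         //    ...and release with new version
                }
                locked.clear();
            } finally {
                // Abort path only: drop the lock bit on anything still held.
                for (TVar v : locked) v.vlock.set(v.vlock.get() & ~1L);
            }
        }
    }

    // Usage: retry the whole transaction until it commits.
    static void transfer(TVar from, TVar to, int amount) {
        while (true) {
            try {
                Tx tx = new Tx();
                tx.write(from, tx.read(from) - amount);
                tx.write(to, tx.read(to) + amount);
                tx.commit();
                return;
            } catch (Abort retry) {
                // conflict detected: run the transaction again
            }
        }
    }
}
```

Because locks are only taken inside `commit()`, a transaction never holds a lock while user code runs; the cost is that a conflict is only discovered at read or commit time.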

Page 36: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 37: Deuce STM - CMP'09

LSA (Lazy Snapshot Algorithm)
Torvald Riegel, Pascal Felber and Christof Fetzer [DISC06]

ETL - Encounter-time locking
• Start
  o Sample global version-clock
• Run through a speculative execution
  o Lock on write access
  o Collect read-set & write-set
• On validation error try to extend snapshot
• End
  o Increment global version-clock
  o Validate the read-set
  o Commit and release the locks
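A toy sketch of the encounter-time variant, again for a transactional `int` cell. It is an illustration under stated assumptions rather than the LSA-STM code: `LsaSketch`, `TVar`, `Tx` and the lock encoding are invented for the example, writes are buffered while the lock is held (an in-place variant with an undo log is also possible), and a conflict throws `Abort` after releasing any locks.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class LsaSketch {
    static final AtomicLong CLOCK = new AtomicLong();          // global version-clock

    static class Abort extends RuntimeException {}

    // Transactional int cell; vlock = (version << 1) | lockedBit.
    static class TVar {
        volatile int value;
        final AtomicLong vlock = new AtomicLong();
        TVar(int initial) { value = initial; }
    }

    static class Tx {
        long snapshot = CLOCK.get();                            // Start: sample the clock
        final Map<TVar, Long> readSet = new HashMap<>();        // cell -> version read
        final Map<TVar, Integer> writeSet = new LinkedHashMap<>(); // locked cell -> new value

        int read(TVar v) {
            Integer pending = writeSet.get(v);
            if (pending != null) return pending;
            while (true) {
                long pre = v.vlock.get();
                int val = v.value;
                if ((pre & 1) != 0 || v.vlock.get() != pre) throw abort();
                long version = pre >>> 1;
                if (version > snapshot) {                       // too new for this snapshot:
                    extend();                                   // try to move the snapshot
                    continue;                                   // forward, then re-read
                }
                readSet.put(v, version);
                return val;
            }
        }

        void write(TVar v, int val) {
            if (!writeSet.containsKey(v)) {                     // lock on first write access
                long cur = v.vlock.get();
                if ((cur & 1) != 0 || !v.vlock.compareAndSet(cur, cur | 1)) throw abort();
            }
            writeSet.put(v, val);
        }

        // Every cell read so far must still carry the version that was read.
        void validate() {
            for (Map.Entry<TVar, Long> e : readSet.entrySet()) {
                long cur = e.getKey().vlock.get();
                boolean lockedByOther = (cur & 1) != 0 && !writeSet.containsKey(e.getKey());
                if (lockedByOther || (cur >>> 1) != e.getValue()) throw abort();
            }
        }

        void extend() {
            long now = CLOCK.get();
            validate();                                         // aborts if a read is stale
            snapshot = now;
        }

        void commit() {
            if (writeSet.isEmpty()) return;                     // read-only: nothing to publish
            long wv = CLOCK.incrementAndGet();                  // increment the clock
            validate();                                         // validate the read-set
            for (Map.Entry<TVar, Integer> e : writeSet.entrySet()) {
                e.getKey().value = e.getValue();                // commit...
                e.getKey().vlock.set(wv << 1);                  // ...and release the lock
            }
            writeSet.clear();
        }

        // Releases every lock held by this transaction and returns the exception to throw.
        Abort abort() {
            for (TVar v : writeSet.keySet()) v.vlock.set(v.vlock.get() & ~1L);
            writeSet.clear();
            return new Abort();
        }
    }
}
```

The visible difference from commit-time locking is in `read`: a version newer than the snapshot triggers an extension attempt instead of an immediate abort, so a transaction survives concurrent commits that did not touch anything it has read.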

Page 38: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 39: Deuce STM - CMP'09

Benchmarks (Azul – Vega2 – 2 x 46)

Page 40: Deuce STM - CMP'09

Benchmarks (SuperMicro – 2 x Quad Intel)

Page 41: Deuce STM - CMP'09

Benchmarks (Sun UltraSPARC T2 Plus – 2 x Quad x 8HT)

Page 42: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 43: Deuce STM - CMP'09

Summary

• Simple API
  o @Atomic
• No changes to Java
  o No reserved words
• Open source
  o On Google Code
• Shows nice scalability
  o Field based

Page 44: Deuce STM - CMP'09

Outline

• Motivation
• Solutions
• Deuce
• Implementation
• TL2
• LSA
• Benchmarks
• Summary
• References

Page 45: Deuce STM - CMP'09

References

• Homepage - http://www.deucestm.org

• Project - http://code.google.com/p/deuce/

• Wikipedia - http://en.wikipedia.org/wiki/Software_transactional_memory

• TL2 – http://research.sun.com/scalable

• LSA-STM - http://tmware.org/lsastm