csc 536 lecture 4. outline distributed transactions stm (software transactional memory) scalastm...

CSC 536 Lecture 4

Upload: eleanor-wilkerson

Post on 31-Dec-2015


Page 1: CSC 536 Lecture 4. Outline Distributed transactions STM (Software Transactional Memory) ScalaSTM Consistency Defining consistency models Data centric,

CSC 536 Lecture 4

Outline

Distributed transactions

STM (Software Transactional Memory)

ScalaSTM

Consistency

Defining consistency models

Data centric, Client centric

Implementing consistency

Replica management, Consistency Protocols

Distributed Transactions

Distributed transactions

Transactions, like mutual exclusion, protect shared data against simultaneous access by several concurrent processes.

Transactions allow a process to access and modify multiple data items as a single atomic transaction.

If the process backs out halfway during the transaction, everything is restored to the point just before the transaction started.

Distributed transactions: example 1

A customer dials into her bank web account and does the following:

Withdraws amount x from account 1.

Deposits amount x to account 2.

If the connection is broken after the first step but before the second, what happens?

Either both steps or neither should be completed.

This requires special primitives provided by the distributed system (DS).

The Transaction Model

Examples of primitives for transactions

Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise

Distributed transactions: example 2

a) Transaction to reserve three flights commits

b) Transaction aborts when third flight is unavailable

(a)
BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi;
END_TRANSACTION

(b)
BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi full =>
ABORT_TRANSACTION

ACID

Transactions are

Atomic: to the outside world, the transaction happens indivisibly.

Consistent: the transaction does not violate system invariants.

Isolated (or serializable): concurrent transactions do not interfere with each other.

Durable: once a transaction commits, the changes are permanent.

Flat, nested and distributed transactions

a) A nested transaction

b) A distributed transaction

Implementation of distributed transactions

For simplicity, we consider transactions on a file system.

Note that if each process executing a transaction just updates the file in place, transactions will not be atomic, and changes will not vanish if the transaction aborts.

Other methods required.

Atomicity

If each process executing a transaction just updates the file in place, transactions will not be atomic, and changes will not vanish if the transaction aborts.

Solution 1: Private Workspace

a) The file index and disk blocks for a three-block file

b) The situation after a transaction has modified block 0 and appended block 3

c) After committing
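The private-workspace idea can be sketched in a few lines of Scala. This is only an illustration under simplifying assumptions: `FileSystem`, `Workspace`, and their methods are hypothetical names, blocks are strings, and the whole model lives in memory. Writes go to shadow blocks and a private copy of the index; the shared file changes only when `commit()` installs the private index.

```scala
// Hypothetical model: a file is an index (block ids in order) over a block store.
final case class FileSystem(blocks: Map[Int, String], index: Vector[Int])

final class Workspace(base: FileSystem) {
  // Private copies: the shared file is not touched until commit.
  private var privateBlocks = Map.empty[Int, String]
  private var privateIndex = base.index
  private var nextId = if (base.blocks.isEmpty) 0 else base.blocks.keys.max + 1

  // Allocate a new private block holding the given data.
  private def fresh(data: String): Int = {
    val id = nextId
    nextId += 1
    privateBlocks += (id -> data)
    id
  }

  // Modify the block at position pos by writing a shadow copy.
  def write(pos: Int, data: String): Unit =
    privateIndex = privateIndex.updated(pos, fresh(data))

  // Append a new block at the end of the file.
  def append(data: String): Unit =
    privateIndex = privateIndex :+ fresh(data)

  // Commit: install the private index; blocks it no longer references are dropped.
  def commit(): FileSystem = {
    val all = base.blocks ++ privateBlocks
    FileSystem(all.filter { case (id, _) => privateIndex.contains(id) }, privateIndex)
  }

  // Abort: discard the workspace; the base file is unchanged.
  def abort(): FileSystem = base
}
```

Aborting is cheap in this scheme because nothing shared was modified; committing is a single index replacement.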

Solution 2: Write-ahead Log

(a) A transaction

(b) – (d) The log before each statement is executed

(a)
x = 0;
y = 0;
BEGIN_TRANSACTION;
    x = x + 1;
    y = y + 2;
    x = y * y;
END_TRANSACTION;

(b) Log: [x = 0/1]

(c) Log: [x = 0/1] [y = 0/2]

(d) Log: [x = 0/1] [y = 0/2] [x = 1/4]
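A minimal in-memory sketch of a write-ahead (undo) log for the transaction above, assuming integer variables and a single thread. `LogRecord` and `LoggedStore` are hypothetical names introduced for illustration; a real log would be forced to stable storage before the in-place update.

```scala
import scala.collection.mutable

// One log record: variable name, old value, new value (written [x = old/new] on the slide).
final case class LogRecord(name: String, oldValue: Int, newValue: Int)

final class LoggedStore(initial: Map[String, Int]) {
  private val data = mutable.Map.empty[String, Int] ++= initial
  private val log = mutable.ArrayBuffer.empty[LogRecord]

  def read(name: String): Int = data(name)

  // Append the log record first, then update the value in place.
  def write(name: String, value: Int): Unit = {
    log += LogRecord(name, data(name), value)
    data(name) = value
  }

  def records: List[LogRecord] = log.toList

  // Abort: undo the writes by walking the log backwards and restoring old values.
  def rollback(): Unit = {
    log.reverseIterator.foreach(r => data(r.name) = r.oldValue)
    log.clear()
  }

  // Commit: the in-place values stand, so the log can be discarded.
  def commit(): Unit = log.clear()
}
```

Running the three statements of the slide's transaction against `new LoggedStore(Map("x" -> 0, "y" -> 0))` and then calling `rollback()` returns both variables to 0.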

Concurrency control (1)

We just learned how to achieve atomicity; we will learn about durability when discussing fault tolerance

Need to handle consistency and isolation

Concurrency control allows several transactions to be executed simultaneously, while making sure that the data is left in a consistent state

This is done by scheduling operations on data in an order whereby the final result is the same as if all transactions had run sequentially

Concurrency control (2)

General organization of managers for handling transactions

Concurrency control (3)

General organization of managers for handling distributed transactions.

Serializability

The main issue in concurrency control is the scheduling of conflicting operations (operating on same data item and one of which is a write operation)

Read/Write operations can be synchronized using:

Mutual exclusion mechanisms, or

Scheduling using timestamps

Pessimistic/optimistic concurrency control

The lost update problem

Accounts a, b, and c start with $100, $200, and $300, respectively

Transaction T:                            Transaction U:
balance = b.getBalance();                 balance = b.getBalance();
b.setBalance(balance*1.1);                b.setBalance(balance*1.1);
a.withdraw(balance/10)                    c.withdraw(balance/10)

balance = b.getBalance();     $200
                                          balance = b.getBalance();     $200
                                          b.setBalance(balance*1.1);    $220
b.setBalance(balance*1.1);    $220
a.withdraw(balance/10)        $80
                                          c.withdraw(balance/10)        $280
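The arithmetic of the lost update can be replayed deterministically. The sketch below is an illustration only: `LostUpdate` is a hypothetical object, balances are whole dollars, and `balance*1.1` is written as `balance * 11 / 10` to stay in integers. It compares the interleaving in which both transactions read b before either writes it with a serial run of T followed by U.

```scala
object LostUpdate {
  // Interleaved run: both transactions read b = $200 before either writes.
  def interleaved(): Map[String, Int] = {
    var a = 100; var b = 200; var c = 300
    val balT = b             // T reads $200
    val balU = b             // U reads $200
    b = balU * 11 / 10       // U writes $220
    b = balT * 11 / 10       // T overwrites with $220: U's update is lost
    a -= balT / 10           // a becomes $80
    c -= balU / 10           // c becomes $280
    Map("a" -> a, "b" -> b, "c" -> c)
  }

  // Serial run: T completes before U starts.
  def serial(): Map[String, Int] = {
    var a = 100; var b = 200; var c = 300
    val balT = b; b = balT * 11 / 10; a -= balT / 10   // T: b = $220, a = $80
    val balU = b; b = balU * 11 / 10; c -= balU / 10   // U: b = $242, c = $278
    Map("a" -> a, "b" -> b, "c" -> c)
  }
}
```

In the interleaved run b ends at $220 instead of $242, which is the update that went missing.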

The inconsistent retrievals problem

Accounts a and b start with $200 each.

Transaction V:                  Transaction W:
a.withdraw(100)                 aBranch.branchTotal()
b.deposit(100)

a.withdraw(100);     $100
                                total = a.getBalance()          $100
                                total = total+b.getBalance()    $300
                                total = total+c.getBalance()
b.deposit(100)       $300

A serialized interleaving of T and U

Transaction T:                            Transaction U:
balance = b.getBalance()                  balance = b.getBalance()
b.setBalance(balance*1.1)                 b.setBalance(balance*1.1)
a.withdraw(balance/10)                    c.withdraw(balance/10)

balance = b.getBalance()      $200
b.setBalance(balance*1.1)     $220
                                          balance = b.getBalance()      $220
                                          b.setBalance(balance*1.1)     $242
a.withdraw(balance/10)        $80
                                          c.withdraw(balance/10)        $278

A serialized interleaving of V and W

Transaction V:                  Transaction W:
a.withdraw(100);                aBranch.branchTotal()
b.deposit(100)

a.withdraw(100);     $100
b.deposit(100)       $300
                                total = a.getBalance()          $100
                                total = total+b.getBalance()    $400
                                total = total+c.getBalance()
                                ...

Read and write operation conflict rules

Operations of different transactions    Conflict    Reason
read     read                           No          Because the effect of a pair of read operations does not depend on the order in which they are executed
read     write                          Yes         Because the effect of a read and a write operation depends on the order of their execution
write    write                          Yes         Because the effect of a pair of write operations depends on the order of their execution

Serializability

Two transactions are serialized if and only if all pairs of conflicting operations of the two transactions are executed in the same order at all objects they both access.

A non-serialized interleaving of operations of transactions T and U

Transaction T:          Transaction U:
x = read(i)
write(i, 10)
                        y = read(j)
                        write(j, 30)
write(j, 20)
                        z = read(i)
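The serializability condition for two transactions can be checked mechanically. The following sketch is an illustration, not a general scheduler: `Serializability`, `Op`, and `isSerialized` are hypothetical names, operations are tagged with plain strings, and the check only covers the two-transaction case stated on the slide. It applies the read/write conflict rules and verifies that every conflicting pair is ordered the same way.

```scala
object Serializability {
  // One step of an interleaving: transaction name, "read" or "write", object name.
  final case class Op(txn: String, kind: String, obj: String)

  // Two operations conflict if they come from different transactions,
  // touch the same object, and at least one of them is a write.
  def conflicts(p: Op, q: Op): Boolean =
    p.txn != q.txn && p.obj == q.obj && (p.kind == "write" || q.kind == "write")

  // For every conflicting pair, record which transaction went first.
  // With two transactions, the interleaving is serialized only if it is always the same one.
  def isSerialized(schedule: List[Op]): Boolean = {
    val firsts = for {
      (p, i) <- schedule.zipWithIndex
      q <- schedule.drop(i + 1)
      if conflicts(p, q)
    } yield p.txn
    firsts.distinct.size <= 1
  }
}
```

For the interleaving above, T goes first at object i but U goes first at object j, so the check fails; running all of T before all of U passes.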

Recoverability of aborts

Aborted transactions must be prevented from affecting other concurrent transactions

Dirty reads

Cascading aborts

Premature writes

A dirty read when transaction T aborts

Transaction T:                          Transaction U:
a.getBalance()                          a.getBalance()
a.setBalance(balance + 10)              a.setBalance(balance + 20)

balance = a.getBalance()      $100
a.setBalance(balance + 10)    $110
                                        balance = a.getBalance()      $110
                                        a.setBalance(balance + 20)    $130
                                        commit transaction
abort transaction

Cascading aborts

Suppose:

U delays committing until concurrent transaction T decides whether to commit or abort

Transaction V has seen the effects due to transaction U

T decides to abort

V and U must abort

Overwriting uncommitted values

Transaction T:                  Transaction U:
a.setBalance(105)               a.setBalance(110)

                     $100
a.setBalance(105)    $105
                                a.setBalance(110)    $110

Transactions T and U with locks

Transaction T:                                  Transaction U:
bal = b.getBalance()                            bal = b.getBalance()
b.setBalance(bal*1.1)                           b.setBalance(bal*1.1)
a.withdraw(bal/10)                              c.withdraw(bal/10)

Operations               Locks                  Operations               Locks
openTransaction
bal = b.getBalance()     lock B
b.setBalance(bal*1.1)                           openTransaction
a.withdraw(bal/10)       lock A                 bal = b.getBalance()     waits for T’s lock on B
closeTransaction         unlock A, B
                                                                         lock B
                                                b.setBalance(bal*1.1)
                                                c.withdraw(bal/10)       lock C
                                                closeTransaction         unlock B, C

Two-phase locking (2)

Idea: the scheduler grants locks in a way that creates only serializable schedules.

In two-phase locking, the transaction acquires all the locks it needs in the first phase, and then releases them in the second. This ensures a serializable schedule.

Dirty reads, cascading aborts, premature writes are still possible

Under strict 2-phase locking, a transaction that needs to read or write an object must be delayed until other transactions that wrote the same object have committed or aborted

Locks are held until transaction commits or aborts

Example: CORBA Concurrency Control Service

Two-phase locking in a distributed system

The data is assumed to be distributed across multiple machines

Centralized 2PL: central scheduler grants locks

Primary 2PL: local scheduler is coordinator for local data

Distributed 2PL: (data may be replicated)

The local schedulers use a distributed mutual exclusion algorithm to obtain a lock

The local scheduler forwards Read/Write operations to data managers holding the replicas

Two-phase locking issues

Exclusive locks reduce concurrency more than necessary. It is sometimes preferable to allow concurrent transactions to read an object; two types of locks may be needed (read locks and write locks)

Deadlocks are possible.

Solution 1: acquire all locks in the same order.

Solution 2: use a graph to detect potential deadlocks.
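Solution 1 can be illustrated with a small sketch that always acquires locks in a fixed global order. The `Account` class, its integer `id`, and `OrderedLocking.transfer` are hypothetical and chosen only for this example; `ReentrantLock` is the standard JDK lock. Because every transfer locks the account with the smaller id first, two concurrent transfers can never each hold the lock the other one needs.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical account with a per-object lock; id defines the global lock order.
final class Account(val id: Int, var balance: Int) {
  val lock = new ReentrantLock()
}

object OrderedLocking {
  // Lock both accounts in ascending id order, regardless of transfer direction.
  def transfer(from: Account, to: Account, amount: Int): Unit = {
    val ordered = List(from, to).sortBy(_.id)
    ordered.foreach(_.lock.lock())
    try {
      from.balance -= amount
      to.balance += amount
    } finally {
      // Release in the reverse order of acquisition.
      ordered.reverse.foreach(_.lock.unlock())
    }
  }
}
```

The price of this approach is that a transaction must know all the objects it will touch before it starts locking.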

Deadlock with write locks

Transaction T                                   Transaction U
Operations          Locks                       Operations          Locks
a.deposit(100);     write lock A
                                                b.deposit(200)      write lock B
b.withdraw(100)     waits for U’s lock on B
                                                a.withdraw(200);    waits for T’s lock on A

The wait-for graph

[Figure: wait-for graph. Lock A is held by T and lock B is held by U; T waits for B and U waits for A, so T and U wait for each other.]
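Deadlock detection on a wait-for graph amounts to finding a cycle. The sketch below is a simple illustration: `WaitForGraph.hasDeadlock` is a hypothetical helper, the graph is given as a map from each transaction to the transactions it waits for, and the depth-first search is written for clarity rather than efficiency.

```scala
object WaitForGraph {
  // waitsFor(t) = the transactions whose locks t is currently waiting for.
  def hasDeadlock(waitsFor: Map[String, Set[String]]): Boolean = {
    // Depth-first search: reaching a transaction already on the current path closes a cycle.
    def visit(t: String, path: Set[String]): Boolean =
      path.contains(t) ||
        waitsFor.getOrElse(t, Set.empty[String]).exists(next => visit(next, path + t))
    waitsFor.keys.exists(t => visit(t, Set.empty[String]))
  }
}
```

For the deadlock with write locks, `hasDeadlock(Map("T" -> Set("U"), "U" -> Set("T")))` reports a cycle; once one of the two edges is removed (for example by aborting T), it no longer does.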

Deadlock prevention with timeouts

Transaction T                                   Transaction U
Operations          Locks                       Operations          Locks
a.deposit(100);     write lock A
                                                b.deposit(200)      write lock B
b.withdraw(100)     waits for U’s lock on B
                                                a.withdraw(200);    waits for T’s lock on A
                    (timeout elapses)
T’s lock on A becomes vulnerable, unlock A, abort T
                                                a.withdraw(200);    write locks A
                                                                    unlock A, B

Disadvantages of locking

High overhead

Deadlocks

Locks cannot be released until the end of the transaction, which reduces concurrency

In most applications, the likelihood of two clients accessing the same object is low

Pessimistic timestamp concurrency control

A transaction’s request to write an object is valid only if that object was last read and written by an earlier transaction

A transaction’s request to read an object is valid only if that object was last written by an earlier transaction

Advantage: Non-blocking and deadlock-free

Disadvantage: Transactions may need to abort and restart

Operation conflicts for timestamp ordering

Rule    Tc       Ti
1.      write    read     Tc must not write an object that has been read by any Ti where Ti > Tc; this requires that Tc ≥ the maximum read timestamp of the object.
2.      write    write    Tc must not write an object that has been written by any Ti where Ti > Tc; this requires that Tc > write timestamp of the committed object.
3.      read     write    Tc must not read an object that has been written by any Ti where Ti > Tc; this requires that Tc > write timestamp of the committed object.
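The three rules translate directly into two predicates. This is a simplified sketch: `TimestampOrdering`, `ObjState`, and the use of plain integers as timestamps are assumptions made for illustration, and tentative (uncommitted) versions are ignored.

```scala
object TimestampOrdering {
  // Per-object bookkeeping: largest read timestamp and the committed write timestamp.
  final case class ObjState(maxRead: Int, committedWrite: Int)

  // Rules 1 and 2: Tc may write only if no later transaction has read or written the object.
  def canWrite(tc: Int, o: ObjState): Boolean =
    tc >= o.maxRead && tc > o.committedWrite

  // Rule 3: Tc may read only if the committed version was written by an earlier transaction.
  def canRead(tc: Int, o: ObjState): Boolean =
    tc > o.committedWrite
}
```

A request that fails its check is rejected, and the requesting transaction aborts and restarts with a new timestamp rather than blocking.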

Pessimistic Timestamp Ordering

Concurrency control using timestamps.

Optimistic timestamp ordering

Idea: just go ahead and do the operations without paying attention to what concurrent transactions are doing:

Keep track of when each data item has been read and written.

Before committing, check whether any item has been changed since the transaction started. If so, abort. If not, commit.

Advantage: deadlock free and fast.

Disadvantage: it can fail and transactions must be run again.

Example: ScalaSTM
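The validate-at-commit idea can be sketched with version numbers. This is a single-threaded illustration under stated assumptions: `VersionedStore` and its inner `Txn` class are hypothetical, every committed write bumps the item's version, and a real implementation (ScalaSTM included) would also have to make validation and publication atomic with respect to other threads.

```scala
import scala.collection.mutable

// Hypothetical versioned store: each item maps to (value, version).
final class VersionedStore {
  private val values = mutable.Map.empty[String, (Int, Long)]

  // A committed write installs the value and bumps the version.
  def put(name: String, value: Int): Unit = {
    val version = values.get(name).map(_._2 + 1).getOrElse(0L)
    values(name) = (value, version)
  }

  def get(name: String): (Int, Long) = values(name)

  final class Txn {
    private val readVersions = mutable.Map.empty[String, Long]
    private val writes = mutable.Map.empty[String, Int]

    // Reads see the transaction's own buffered writes, else the store;
    // the version seen on first read is remembered for validation.
    def read(name: String): Int =
      writes.getOrElse(name, {
        val (value, version) = get(name)
        readVersions.getOrElseUpdate(name, version)
        value
      })

    // Writes are buffered privately until commit.
    def write(name: String, value: Int): Unit = { writes(name) = value }

    // Validate, then publish: fail if anything read has changed since it was read.
    def commit(): Boolean = {
      val valid = readVersions.forall { case (n, version) => get(n)._2 == version }
      if (valid) writes.foreach { case (n, v) => put(n, v) }
      valid
    }
  }
}
```

A transaction whose `commit()` returns false has made no visible changes and can simply be run again.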

Software Transactional Memory (STM)

Software Transactional Memory (STM)

Software transactional memory is a mediator that sits between a critical section of your code (the atomic block) and the program’s heap.

The STM intervenes during reads and writes in the atomic block, allowing it to check for and/or avoid interference from other threads.

Software Transactional Memory (STM)

STM uses optimistic concurrency control to coordinate thread-safe access to shared data structures

replaces the traditional approach of using locks

Assumes that atomic blocks will run concurrently without conflict

If reads and writes by multiple threads have gotten interleaved incorrectly then all of the writes of the atomic block are rolled back and the entire block is retried

If reads and writes are not interleaved, then it is as if they were done atomically and the atomic block can be committed

Other threads or actors can only see committed changes

Keeps old versions of data so that you can back up

ScalaSTM

ScalaSTM is an implementation of STM for Scala

It manages only memory locations encapsulated in instances of mutable class Ref[A]

A is an immutable type

Ref-s ensure that fewer memory locations need to be managed

Changes to Ref-s values make use of Scala’s efficient immutable data structures

Allows atomic blocks to be expressed directly in Scala

No synchronized, no deadlocks or race conditions, and good scalability

Includes concurrent sets and maps and an easier and safer replacement for wait and notifyAll

ScalaSTM first example

import scala.concurrent.stm._

val (x, y) = (Ref(10), Ref(0))

def sum = atomic { implicit txn =>
  val a = x()
  val b = y()
  a + b
}

def transfer(n: Int): Unit = {
  atomic { implicit txn =>
    x() -= n
    y() += n
  }
}

Use a Ref for each shared variable to get STM involved

Use atomic for each critical section

atomic is a function with implicit parameter of type InTxn

ScalaSTM first example

// sum                             // transfer(2)
atomic                             atomic
| begin txn attempt                | begin txn attempt
| | read x -> 10                   | | read x -> 10
| | :                              | | write x <- 8
| |                                | | read y -> 0
| | :                              | | write y <- 2
| |                                | commit
| | read y -> x read is invalid    +-> ()
| roll back
| begin txn attempt
| | read x -> 8
| | read y -> 2
| commit
+-> 10

When sum tries to read y, STM detects that the value previously read from x is no longer correct

On the second attempt sum succeeds

ScalaSTM example: ConcurrentIntList

In a shared, mutable linked list, we need thread-safety for each node’s prev and next references

Use a Ref for each reference to get STM involved

Ref is a single mutable cell

import scala.concurrent.stm._

class ConcurrentIntList {
  private class Node(val elem: Int, prev0: Node, next0: Node) {
    val isHeader = prev0 == null
    val prev = Ref(if (isHeader) this else prev0)
    val next = Ref(if (isHeader) this else next0)
  }

  private val header = new Node(-1, null, null)

ScalaSTM example: ConcurrentIntList

Appending a new node involves reads/writes of several references that should be done atomically

If x is a Ref, x() gets the value stored in x, and x() = val sets it to val

Ref-s can only be read and written inside an atomic block

  def addLast(elem: Int): Unit = {
    atomic { implicit txn =>
      val p = header.prev()
      val newNode = new Node(elem, p, header)
      p.next() = newNode
      header.prev() = newNode
    }
  }

ScalaSTM example: ConcurrentIntList

Ref-s can only be read and written inside an atomic block

This is checked at compile time by requiring that an implicit InTxn value be available. Atomic blocks are functions that take an InTxn parameter, so this requirement can be satisfied by marking the parameter as implicit.

You create a new atomic block using

atomic { implicit t =>

// the body

}

The presence of an implicit InTxn instance grants the caller permission to perform transactional reads and writes on Ref-s

ScalaSTM example: ConcurrentIntList

Ref.single returns an instance of Ref.View, which acts just like the original Ref except that it can also be accessed outside an atomic block.

Each method on Ref.View acts like a single-operation transaction

If an atomic block only accesses a single Ref, it is more concise and more efficient to use a Ref.View.

  // def isEmpty = atomic { implicit t => header.next() == header }
  def isEmpty = header.next.single() == header

ScalaSTM example: ConcurrentIntList

retry is used when an atomic block can’t complete on its current input state

Calling retry inside an atomic block will cause it to roll back, wait for one of its inputs to change, and then retry execution.

STM keeps track of an atomic block’s read set, the set of Ref-s that have been read during the transaction

STM can block the current thread until another thread has written to an element of its read set, at which time the atomic block can be retried

  def removeFirst(): Int = atomic { implicit txn =>
    val n = header.next()
    if (n == header)
      retry
    val nn = n.next()
    header.next() = nn
    nn.prev() = header
    n.elem
  }

STM Pros

Intuitive
You designate a code block atomic and it executes atomically without deadlocks or races; no need for locks

Readers scale
All of the threads in a system can read data without interfering with each other.

Exceptions automatically trigger cleanup
If an atomic block throws an exception, all of the Ref-s are reset to their original state

Waiting for complex conditions is easy
If an atomic block doesn’t find the state it’s looking for, it can call retry to back up and wait for any of its inputs to change.

Cons

Two extra characters per read or write
If x is a Ref, then x() reads its value and x() = v writes it.

Single-thread overhead
In most cases STMs are slower when the program isn’t actually running in parallel

Rollback doesn’t mix well with I/O
Only changes to Ref-s are undone automatically
Shouldn’t really be doing I/O inside a critical section

Replication and consistency

Replication and consistency

Replication

Reliability

Performance

Multiple replicas lead to consistency problems.

Keeping the copies the same can be expensive.

Replication and scaling

We focus for now on replication as a scaling technique

Placing copies close to processes using them reduces the load on the network

But replication can be expensive

Keeping copies up to date puts more load on the network

Example: distributed totally-ordered multicasting

Example: central coordinator

The only real solution is to loosen the consistency requirements

The price is that replicas may not be the same

We will develop several consistency models

Data-Centric Consistency Models

The general organization of a logical data store, physically distributed and replicated across multiple processes.

Consistency models

A consistency model is a contract between processes and the data store.

If processes obey certain rules, i.e. behave according to the contract, then ...

The data store will work “correctly”, i.e. according to the contract.

Strict consistency

Any read on a data item x returns a value corresponding to the most recent write on x

Definition implicitly assumes existence of absolute global time

Definition makes sense in uniprocessor systems:

a=1; a=2; print(a);

Definition makes little sense in a distributed system

Machine B writes x

A moment later, machine A reads x

Does A read old or new value?

Strict Consistency

Behaviour of two processes, operating on the same data item.

a) A strictly consistent store.

b) A store that is not strictly consistent.

Weakening of the model

Strict consistency is ideal, but unachievable

Must use weaker models, and that's OK!

Writing programs that assume/require the strict consistency model is unwise

If order of events is essential, use critical sections

Sequential consistency

The result of any execution is the same as if

the (read and write) operations by all processes on the data store were executed in some sequential order, and

the operations of each individual process appear in this sequence in the order specified by its program

No reference to time

All processes see the same interleaving of operations.

Sequential Consistency

a) A sequentially consistent data store.

b) A data store that is not sequentially consistent.

Sequential Consistency

Three concurrently executing processes.

Process P1        Process P2        Process P3
x = 1;            y = 1;            z = 1;
print(y, z);      print(x, z);      print(x, y);

Sequential Consistency

Four valid execution sequences for the processes of the previous slide. The vertical axis is time.

(a)                 (b)                 (c)                 (d)
x = 1;              x = 1;              y = 1;              y = 1;
print(y, z);        y = 1;              z = 1;              x = 1;
y = 1;              print(x, z);        print(x, y);        z = 1;
print(x, z);        print(y, z);        print(x, z);        print(x, z);
z = 1;              z = 1;              x = 1;              print(y, z);
print(x, y);        print(x, y);        print(y, z);        print(x, y);

Prints: 001011      Prints: 101011      Prints: 010111      Prints: 111111
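The "Prints:" line of each sequence can be reproduced by running the statements in order against one shared store. The sketch below is illustrative: `Interleavings`, `Assign`, `Print`, and `run` are hypothetical names, all variables start at 0, and every assignment sets its variable to 1 as in the three processes.

```scala
object Interleavings {
  sealed trait Stmt
  final case class Assign(name: String) extends Stmt                   // name = 1
  final case class Print(first: String, second: String) extends Stmt   // print(first, second)

  // Run one interleaving against a single shared store where x, y, z start at 0,
  // returning the concatenated output of all print statements in execution order.
  def run(sequence: List[Stmt]): String = {
    val (_, out) = sequence.foldLeft((Map.empty[String, Int], "")) {
      case ((store, printed), Assign(n)) =>
        (store.updated(n, 1), printed)
      case ((store, printed), Print(a, b)) =>
        (store, printed + s"${store.getOrElse(a, 0)}${store.getOrElse(b, 0)}")
    }
    out
  }
}
```

Feeding sequence (a) to `run` yields 001011, and sequence (d) yields 111111. Any output that no program-order-preserving interleaving can produce this way is not allowed under sequential consistency.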

Sequential consistency more precisely

An execution E of processor P is a sequence of R/W operations by P on a data store.

This sequence must agree with P's program order.

A history H is an ordering of all R/W operations that is consistent with the execution of each processor.

In a sequentially consistent model, the history H must obey the following rules:

Program order (of each process) must be maintained.
Data coherence must be maintained.



Sequential consistency is expensive

Suppose A is any algorithm that implements a sequentially consistent data store

Let r be the time it takes A to read a data item x
Let w be the time it takes A to write to a data item x
Let t be the message time delay between any two nodes

Then r + w ≥ t

If there are many reads and many writes, we need weaker consistency models.


Causal consistency

Sequential consistency may be too strict: it enforces that every pair of events is seen by everyone in the same order, even if the two events are not causally related

Causally related: P1 writes to x; then P2 reads x and writes to y
Not causally related: P1 writes to x and, concurrently, P2 writes to x

Reminder: operations that are not causally related are said to be concurrent.


Causal consistency

Writes that are potentially causally related must be seen by all processes in the same order.
Concurrent writes may be seen in a different order on different machines.
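
One common way to enforce this rule is to tag each write with a vector clock and have each replica hold back a write until everything it may depend on has been applied. The sketch below assumes that technique; the slides do not prescribe an implementation, and the names CausalReplica, Write, receive and read are hypothetical. Entry i of a write's clock counts the writes from process i that the sender had seen, including the write itself for the sender's own entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class CausalReplica {
    /** A write tagged with the sender's vector clock at the time it was issued. */
    public static final class Write {
        final int sender;     // index of the issuing process
        final int[] clock;    // clock[i] = writes by process i that this write may depend on
        final String key;
        final int value;

        public Write(int sender, int[] clock, String key, int value) {
            this.sender = sender;
            this.clock = clock.clone();
            this.key = key;
            this.value = value;
        }
    }

    private final int[] applied;                       // applied[i] = writes from process i applied here
    private final Map<String, Integer> store = new HashMap<>();
    private final List<Write> held = new ArrayList<>(); // received but not yet applicable

    public CausalReplica(int processes) {
        applied = new int[processes];
    }

    /** A write is applicable once it is the sender's next write and all its dependencies are applied. */
    private boolean deliverable(Write w) {
        for (int i = 0; i < applied.length; i++) {
            if (i == w.sender) {
                if (w.clock[i] != applied[i] + 1) return false;
            } else if (w.clock[i] > applied[i]) {
                return false;   // depends on a write from process i that has not arrived yet
            }
        }
        return true;
    }

    /** Buffers the write, then applies every held write that has become applicable. */
    public void receive(Write w) {
        held.add(w);
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Iterator<Write> it = held.iterator(); it.hasNext();) {
                Write h = it.next();
                if (deliverable(h)) {
                    store.put(h.key, h.value);
                    applied[h.sender]++;
                    it.remove();
                    progress = true;
                }
            }
        }
    }

    public Integer read(String key) {
        return store.get(key);
    }
}
```

For instance, if process 0 writes x with clock [1, 0] and process 1 then reads x and writes y with clock [1, 1], a replica that receives the write to y first holds it until the write to x arrives. Two concurrent writes carry clocks that do not dominate each other, so different replicas are free to apply them in different orders.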


Causal Consistency

This sequence is allowed with a causally consistent store, but not with a sequentially or strictly consistent store.


Causal Consistency

a) A violation of a causally-consistent store.
b) A correct sequence of events in a causally-consistent store.


FIFO Consistency

Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
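
A minimal way to obtain this guarantee, assuming each process numbers its own writes, is for every replica to apply the writes of a given sender strictly in sequence-number order while placing no constraint across senders. The sketch below illustrates that idea only; the class FifoReplica, the Write fields and the numbering from 0 are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class FifoReplica {
    /** A write tagged with its sender and a per-sender sequence number starting at 0. */
    public static final class Write {
        final String sender;
        final int seq;
        final String key;
        final int value;

        public Write(String sender, int seq, String key, int value) {
            this.sender = sender;
            this.seq = seq;
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, Integer> store = new HashMap<>();
    private final Map<String, Integer> nextSeq = new HashMap<>();                 // next expected seq per sender
    private final Map<String, TreeMap<Integer, Write>> pending = new HashMap<>(); // out-of-order writes per sender

    /** Applies writes from each sender in issue order; writes that arrive early are buffered. */
    public void receive(Write w) {
        int expected = nextSeq.getOrDefault(w.sender, 0);
        if (w.seq < expected) {
            return;   // already applied; ignore the duplicate
        }
        TreeMap<Integer, Write> queue = pending.computeIfAbsent(w.sender, s -> new TreeMap<>());
        queue.put(w.seq, w);
        while (queue.containsKey(expected)) {
            Write next = queue.remove(expected);
            store.put(next.key, next.value);
            expected++;
        }
        nextSeq.put(w.sender, expected);
    }

    public Integer read(String key) {
        return store.get(key);
    }
}
```

Because the bookkeeping is kept per sender, two replicas may interleave the writes of P1 and P2 differently, which is exactly the freedom FIFO consistency allows.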


FIFO Consistency

A valid sequence of events for FIFO consistency.


Weak consistency

Sequential and causal consistency are defined at the level of read and write operations

The granularity is fine

Concurrency is typically controlled through synchronization mechanisms such as mutual exclusion or transactions

A sequence of reads/writes is done in an atomic block and their order is not important to other processes
The consistency contract should be at the level of atomic blocks

granularity should be coarser

Associate synchronization of data store with a synchronization variable

when the data store is synchronized, all local writes by process P are propagated to all other copies, whereas writes by other processes are brought in to P's copy
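
The effect of synchronizing can be sketched as follows. This is an illustration under simplifying assumptions, not a real protocol: a single shared map stands in for "all other copies", and the names WeakStore, write, read and synchronize are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class WeakStore {
    private final Map<String, Integer> shared;                            // stands in for all other copies (assumption)
    private final Map<String, Integer> localCopy = new HashMap<>();       // this process's copy of the data store
    private final Map<String, Integer> unsyncedWrites = new HashMap<>();  // local writes not yet propagated

    public WeakStore(Map<String, Integer> shared) {
        this.shared = shared;
    }

    /** Writes go to the local copy only; nothing is propagated yet. */
    public void write(String key, int value) {
        localCopy.put(key, value);
        unsyncedWrites.put(key, value);
    }

    /** Reads see the local copy, which may be stale between synchronizations. */
    public Integer read(String key) {
        return localCopy.get(key);
    }

    /** Pushes local writes out, then brings in the writes of other processes. */
    public void synchronize() {
        synchronized (shared) {
            shared.putAll(unsyncedWrites);
            unsyncedWrites.clear();
            localCopy.clear();
            localCopy.putAll(shared);
        }
    }
}
```

A write by one process becomes visible to another only after the writer has synchronized and the reader has synchronized afterwards; between synchronizations, no ordering of individual reads and writes is promised.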


Weak Consistency

a) A valid sequence of events for weak consistency.
b) An invalid sequence for weak consistency.


Entry consistency

Not one but two synchronization methods

acquire and release

When a lock on a shared variable is being acquired, the variable must be brought up to date

All remote writes must be made visible to the process acquiring the lock

Before updating a shared variable, the process must enter its atomic block (critical section) ensuring exclusive access

After a process has completed the updates within an atomic block, the updates are guaranteed to be visible to another process only if it acquires the lock on the variable


Entry consistency

A valid event sequence for entry consistency.


Release consistency

When a lock on a shared variable is being released, the updated variable value is sent to shared memory

To view the value of a shared variable as guaranteed by the contract, an acquire is required


Release consistency example: JVM

The Java Virtual Machine uses the SMP model of parallel computation.

Each thread will create local copies of shared variables
Calling a synchronized method is equivalent to an acquire
Exiting a synchronized method is equivalent to a release

A release has the effect of flushing the cache to main memory

writes made by this thread can be visible to other threads

An acquire has the effect of invalidating the local cache so that variables will be reloaded from main memory

Writes made by the previous release are made visible
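
The acquire/release pairing can be seen in a small Java program. The sketch below is an illustration only; the class SharedCounter and the helper runDemo are names invented for this example. Note that the Java memory model states the guarantee as a happens-before relation between releasing a monitor and a later acquire of the same monitor; the cache flush and invalidation described on the slide are a convenient mental model for that rule.

```java
public class SharedCounter {
    private int count = 0;   // shared variable, accessed only while holding this object's monitor

    // Entering a synchronized method acts as an acquire: earlier released writes become visible.
    // Leaving it acts as a release: this thread's writes are published to later acquirers.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    /** Starts the given number of threads, each incrementing the counter, and returns the final value. */
    public static int runDemo(int threads, int incrementsPerThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(2, 1000));   // prints 2000
    }
}
```

Without the synchronized keyword, the increments of different threads could interleave and be lost, and a reader would have no guarantee of seeing the latest value, since nothing would play the role of acquire or release.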