chapter 11 - db.inf.uni-tuebingen.de · chapter 11 transaction management concurrent and consistent...

79
Transaction Management Torsten Grust ACID Properties Anomalies The Scheduler Serializability Query Scheduling Locking Two-Phase Locking Optimistic Concurrency Protocol Multi-Version Concurrency Control 1 Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten Grust Wilhelm-Schickard-Institut für Informatik Universität Tübingen

Upload: others

Post on 28-Oct-2019

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

1

Chapter 11Transaction ManagementConcurrent and Consistent Data Access

Architecture and Implementation of Database SystemsSummer 2014

Torsten GrustWilhelm-Schickard-Institut für Informatik

Universität Tübingen

Page 2: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

2

The “Hello World” of Transaction Management

• My bank issued me a debit card to access my account.

• Every once in a while, I’d use it at an ATM to draw somemoney from my account, causing the ATM to perform atransaction in the bank’s database.

Example (ATM transaction)

1 bal ← read_bal (acct_no) ;2 bal ← bal − 100¤ ;3 write_bal (acct_no, bal) ;

• My account is properly updated to reflect the new balance.

Page 3: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

3

Concurrent Access

The problem is: My wife has a card for the very same account, too.

⇒ We might end up using our cards at different ATMs at thesame time, i.e., concurrently.

Example (Concurrent ATM transactions)

me

my wife DB state

bal ← read (acct) ;

1200bal ← read (acct) ; 1200

bal ← bal − 100 ;

1200bal ← bal − 200 ; 1200

write (acct, bal) ;

1100write (acct, bal) ; 1000

• The first update was lost during this execution. Lucky me!

Page 4: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

3

Concurrent Access

The problem is: My wife has a card for the very same account, too.

⇒ We might end up using our cards at different ATMs at thesame time, i.e., concurrently.

Example (Concurrent ATM transactions)

me my wife

DB state

bal ← read (acct) ;

1200

bal ← read (acct) ;

1200

bal ← bal − 100 ;

1200

bal ← bal − 200 ;

1200

write (acct, bal) ;

1100

write (acct, bal) ;

1000

• The first update was lost during this execution. Lucky me!

Page 5: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

3

Concurrent Access

The problem is: My wife has a card for the very same account, too.

⇒ We might end up using our cards at different ATMs at thesame time, i.e., concurrently.

Example (Concurrent ATM transactions)

me my wife DB state

bal ← read (acct) ; 1200bal ← read (acct) ; 1200

bal ← bal − 100 ; 1200bal ← bal − 200 ; 1200

write (acct, bal) ; 1100write (acct, bal) ; 1000

• The first update was lost during this execution. Lucky me!

Page 6: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

4

If the Plug is Pulled . . .

• This time, I want to transfer money over to another account.

Example (Money transfer transaction)

// Subtract money from source (checking) account

1 chk_bal ← read_bal (chk_acct_no) ;2 chk_bal ← chk_bal − 500¤ ;3 write_bal (chk_acct_no, chk_bal) ;

// Credit money to the target (savings) account

4 sav_bal ← read_bal (sav_acct_no) ;5 sav_bal ← sav_bal + 500¤ ;6 7 write_bal (sav_acct_no, sav_bal) ;

• Before the transaction gets to step 7, its execution isinterrupted/cancelled (power outage, disk failure, softwarebug, . . . ). My money is lost /.

Page 7: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

5

ACID Properties

To prevent these (and many other) effects from happening, aDBMS guarantees the following transaction properties:

AtomicityA Either all or none of the updates in a databasetransaction are applied.

ConsistencyC Every transaction brings the database from oneconsistent state to another. (While the transactionexecutes, the database state may be temporarilyinconsistent.)

IsolationI A transaction must not see any effect from othertransactions that run in parallel.

DurabilityD The effects of a successful transaction remainpersistent and may not be undone for systemreasons.

Page 8: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

6

Concurrency Control

data files, indices, . . .

Disk Space Manager

Buffer Manager

Files and Access Methods

Operator Evaluator Optimizer

Executor Parser

RecoveryManager

Lock Manager

TransactionManager

DBMS

Database

SQL Commands

Web Forms Applications SQL Interface

Page 9: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

7

Anomalies: Lost Update

• We already saw an example of the lost update anomaly onslide 3:

The effects of one transaction are lost due to anuncontrolled overwrite performed by the secondtransaction.

Page 10: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

8

Anomalies: Inconsistent Read

Reconsider the money transfer example (slide 4), expressed inSQL syntax:

Example

Transaction 1 Transaction 21 UPDATE Accounts2 SET balance = balance - 5003 WHERE customer = 19044 AND account_type = ’C’;

1 SELECT SUM(balance)2 FROM Accounts3 WHERE customer = 1904;

5 UPDATE Accounts6 SET balance = balance + 5007 WHERE customer = 19048 AND account_type = ’S’;

• Transaction 2 sees a temporary, inconsistent database state.

Page 11: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

9

Anomalies: Dirty Read

At a different day, my wife and me again end up in front of anATM at roughly the same time. This time, my transaction iscancelled (aborted):

Example

me my wife DB state

bal ← read (acct) ; 1200bal ← bal − 100 ; 1200write (acct, bal) ; 1100

bal ← read (acct) ; 1100bal ← bal − 200 ; 1100

abort ; 1200write (acct, bal) ; 900

• My wife’s transaction has already read the modified accountbalance before my transaction was rolled back (i.e., itseffects are undone).

Page 12: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

9

Anomalies: Dirty Read

At a different day, my wife and me again end up in front of anATM at roughly the same time. This time, my transaction iscancelled (aborted):

Example

me my wife DB state

bal ← read (acct) ; 1200bal ← bal − 100 ; 1200write (acct, bal) ; 1100

bal ← read (acct) ; 1100bal ← bal − 200 ; 1100

abort ; 1200

write (acct, bal) ; 900

• My wife’s transaction has already read the modified accountbalance before my transaction was rolled back (i.e., itseffects are undone).

Page 13: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

9

Anomalies: Dirty Read

At a different day, my wife and me again end up in front of anATM at roughly the same time. This time, my transaction iscancelled (aborted):

Example

me my wife DB state

bal ← read (acct) ; 1200bal ← bal − 100 ; 1200write (acct, bal) ; 1100

bal ← read (acct) ; 1100bal ← bal − 200 ; 1100

abort ; 1200write (acct, bal) ; 900

• My wife’s transaction has already read the modified accountbalance before my transaction was rolled back (i.e., itseffects are undone).

Page 14: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

10

Concurrent Execution

• The scheduler decides the execution order of concurrentdatabase accesses.

The transaction scheduler

Client 1 Client 2 Client 3

Scheduler

Access and Storage Layer

32

1

2

1

32

1

1

2

1

1

Page 15: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

11

Database Objects and Accesses

• We now assume a slightly simplified model of databaseaccess:

1 A database consists of a number of named objects. In agiven database state, each object has a value.

2 Transactions access an object o using the twooperations read o and write o.

• In a relational DBMS we have that

object ≡ attribute .

This defines the granularity of our discussion. Otherpossible granularities:

object ≡ row, object ≡ table .

Page 16: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

12

Transactions

Database transaction

A database transaction T is a (strictly ordered) sequence ofsteps. Each step is a pair of an access operation applied to anobject.

• Transaction T = 〈s1, . . . , sn〉• Step si = (ai, ei)

• Access operation ai ∈ {r(ead), w(rite)}The length of a transaction T is its number of steps |T| = n.

We could write the money transfer transaction as

T = 〈 (read, Checking), (write, Checking),(read, Saving), (write, Saving) 〉

32

1

or, more concisely,

T = 〈r(C),w(C), r(S),w(S)〉 .

Page 17: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

13

Schedules

Schedule

A schedule S for a given set of transactions T = {T1, . . . , Tn} is anarbitrary sequence of execution steps

S(k) = (Tj, ai, ei) k = 1 . . .m ,2

1

1

such that

1 S contains all steps of all transactions and nothing else and

2 the order among steps in each transaction Tj is preserved:

(ap, ep) < (aq, eq) in Tj ⇒ (Tj, ap, ep) < (Tj, aq, eq) in S

(read “<” as: occurs before).

We sometimes writeS = 〈r1(B), r2(B),w1(B),w2(B)〉

to abbreviateS(1) = (T1, read, B) S(3) = (T1, write, B)S(2) = (T2, read, B) S(4) = (T2, write, B)

Page 18: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

14

Serial Execution

Serial execution

One particular schedule is serial execution.

• A schedule S is serial iff, for each contained transaction Tj, allits steps are adjacent (no interleaving of transactions andthus no concurrency).

Briefly:

S = Tπ1, Tπ2, . . . , Tπn (for some permutation π of 1, . . . , n)

Consider again the ATM example from slide 3.

• S = 〈r1(B), r2(B),w1(B),w2(B)〉• This is a schedule, but it is not serial.

2

2

1

1

If my wife had gone to the bank one hour later (initiatingtransaction T2), the schedule probably would have been serial.

• S = 〈r1(B),w1(B), r2(B),w2(B)〉2

1

2

1

Page 19: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

15

Correctness of Serial Execution

• Anomalies such as the “lost update” problem on slide 3 canonly occur in multi-user mode.

• If all transactions were fully executed one after another (noconcurrency), no anomalies would occur.

⇒ Any serial execution is correct.

• Disallowing concurrent access, however, is not practical.

Correctness criterion

Allow concurrent executions if their overall effect is equivalentto an (arbitrary) serial execution.

Page 20: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

16

Conflicts

What does it mean for a schedule S to be equivalent to anotherschedule S′?

• Sometimes, we may be able to reorder steps in a schedule.• We must not change the order among steps of any

transaction Tj (↗ slide 13).• Rearranging operations must not lead to a different

result.

• Two operations (Ti, a, e) and (Tj, a′, e′) are said to be inconflict (Ti, a, e) = (Tj, a′, e′) if their order of executionmatters.

• When reordering a schedule, we must not change therelative order of such operations.

• Any schedule S′ that can be obtained this way from S is saidto be conflict equivalent to S.

Page 21: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

17

Conflicts

Based on our read/write model, we can come up with a moremachine-friendly definition of a conflict.

Conflicting operations

Two operations (Ti, a, e) and (Tj, a′, e′) are in conflict (=) in S if

1 they belong to two different transactions (Ti 6= Tj), and

2 they access the same database object, i.e., e = e′, and

3 at least one of them is a write operation.

• This inspires the following conflict matrix:read write

read ×write × ×

• Conflict relation ≺S:

(Ti, a, e) ≺S (Tj, a′, e′):=

(Ti, a, e) = (Tj, a′, e′) ∧ (Ti, a, e) occurs before (Tj, a′, e′) in S

Page 22: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

18

Conflict Serializability

Conflict serializability

A schedule S is conflict serializable iff it is conflict equivalent tosome serial schedule S′.

• The execution of a conflict-serializable S schedule iscorrect.

• Note: S does not have to be a serial schedule.�

Page 23: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

19

Serializability: Example

Example (Three schedules Si for two transactions T1,2, with S2 serial)

Schedule S1

T1 T2

read Awrite A

read Awrite A

read Bwrite B

read Bwrite B

Schedule S2

T1 T2

read Awrite Aread Bwrite B

read Awrite Aread Bwrite B

Schedule S3

T1 T2

read Awrite A

read Awrite Aread Bwrite B

read Bwrite B

• Conflict relations:

(T1, r, A) ≺S1 (T2, w, A), (T1, r, B) ≺S1 (T2, w, B),(T1, w, A) ≺S1 (T2, r, A), (T1, w, B) ≺S1 (T2, r, B),(T1, w, A) ≺S1 (T2, w, A), (T1, w, B) ≺S1 (T2, w, B)

(Note: ≺S2 = ≺S1 )

(T1, r, A) ≺S3 (T2, w, A), (T2, r, B) ≺S3 (T1, w, B),(T1, w, A) ≺S3 (T2, r, A), (T2, w, B) ≺S3 (T1, r, B),(T1, w, A) ≺S3 (T2, w, A), (T2, w, B) ≺S3 (T1, w, B)

⇒ S1serializable

Page 24: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

20

The Conflict Graph

• The serializability idea comes with an effective test for thecorrectness of a schedule S based on its conflict graph G(S)(also: serialization graph):

• The nodes of G(S) are all transactions Ti in S.• There is an edge Ti → Tj iff S contains operations(Ti, a, e) and (Tj, a′, e′) such that (Ti, a, e) ≺S (Tj, a′, e′)

(read: in a conflict equivalent serial schedule, Ti mustoccur before Tj).

• S is conflict serializable iff G(S) is acyclic.

An equivalent serial schedule for S may be immediatelyobtained by sorting G(S) topologically.

Page 25: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

21

Serialization Graph

Example (ATM transactions (↗ slide 3))

• S = 〈r1(A), r2(A),w1(A),w2(A)〉• Conflict relation:(T1, r,A) ≺S (T2, w,A)(T2, r,A) ≺S (T1, w,A)(T1, w,A) ≺S (T2, w,A)

T1

T2

⇒ not serializable

Example (Two money transfers (↗ slide 4))

• S = 〈r1(C),w1(C), r2(C),w2(C), r1(S),w1(S), r2(S),w2(S)〉• Conflict relation:(T1, r, C) ≺S (T2, w, C)(T1, w, C) ≺S (T2, r, C)(T1, w, C) ≺S (T2, w, C)

...

T1

T2

⇒ serializable

Page 26: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

21

Serialization Graph

Example (ATM transactions (↗ slide 3))

• S = 〈r1(A), r2(A),w1(A),w2(A)〉• Conflict relation:(T1, r,A) ≺S (T2, w,A)(T2, r,A) ≺S (T1, w,A)(T1, w,A) ≺S (T2, w,A)

T1

T2

⇒ not serializable

Example (Two money transfers (↗ slide 4))

• S = 〈r1(C),w1(C), r2(C),w2(C), r1(S),w1(S), r2(S),w2(S)〉• Conflict relation:(T1, r, C) ≺S (T2, w, C)(T1, w, C) ≺S (T2, r, C)(T1, w, C) ≺S (T2, w, C)

...

T1

T2

⇒ serializable

Page 27: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

21

Serialization Graph

Example (ATM transactions (↗ slide 3))

• S = 〈r1(A), r2(A),w1(A),w2(A)〉• Conflict relation:(T1, r,A) ≺S (T2, w,A)(T2, r,A) ≺S (T1, w,A)(T1, w,A) ≺S (T2, w,A)

T1

T2

⇒ not serializable

Example (Two money transfers (↗ slide 4))

• S = 〈r1(C),w1(C), r2(C),w2(C), r1(S),w1(S), r2(S),w2(S)〉• Conflict relation:(T1, r, C) ≺S (T2, w, C)(T1, w, C) ≺S (T2, r, C)(T1, w, C) ≺S (T2, w, C)

...

T1

T2

⇒ serializable

Page 28: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

22

Query Scheduling

Can we build a scheduler that always emits a serializableschedule?

Idea:

• Require each transaction toobtain a lock before itaccesses a data object o:

Locking and unlocking of o

1 lock o ;2 . . . access o . . . ;3 unlock o ;

• This prevents concurrentaccess to o.

Client 1 Client 2 Client 3

Scheduler

Access and Storage Layer

32

1

2

1

32

1

2

1

1

Page 29: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

23

Locking

• If a lock cannot be granted (e.g., because another transactionT ′ already holds a conflicting lock) the requestingtransaction T gets blocked.

• The scheduler suspends execution of the blockedtransaction T .

• Once T ′ releases its lock, it may be granted to T , whoseexecution is then resumed.

⇒ Since other transactions can continue execution while T isblocked, locks can be used to control the relative order ofoperations.

Page 30: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

24

Locking and Scheduling

Example (Locking and scheduling)

• Consider twotransactions T1,2:

T1

lock Awrite Alock Bunlock Awrite Bunlock B

T2

lock Awrite Alock Bwrite Bwrite Aunlock Awrite Bunlock B

• Two valid schedules (respecting lock andunlock calls) are:

Schedule S1

T1 T2

lock Awrite Alock Bunlock A

lock Awrite A

write Bunlock B

lock Bwrite Bwrite Aunlock Awrite Bunlock B

Schedule S2

T1 T2

lock Awrite Alock Bwrite Bwrite Aunlock A

lock Awrite A

write Bunlock B

lock Bwrite Bunlock Bunlock A

• Note: Both schedules S1,2 are serializable. Are we done yet?

Page 31: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

25

Locking and Serializability

Example (Proper locking does not guarantee serializability yet)

Even if we adhere to a properly nested lock/unlock discipline,the scheduler might still yield non-serializiable schedules:

Schedule S1T1 T2lock Alock Cwrite Awrite Cunlock A

lock Awrite Alock Bunlock Awrite Bunlock B

unlock Clock Bwrite Bunlock B

lock Cwrite Cunlock C

� What is the conflict graph of thisschedule?

Page 32: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

26

ATM Transaction with Locking

Example (Two concurrent ATM transactions with locking)

Transaction 1 Transaction 2 DB state

lock (acct) ;

1200read (acct) ;

unlock (acct) ;lock (acct) ;

read (acct) ;

unlock (acct) ;lock (acct) ;

write (acct) ; 1100

unlock (acct) ;lock (acct) ;

write (acct) ; 1000

unlock (acct) ;

⇒ Again: on its own, proper locking does not guaranteeserializability yet.

Page 33: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

26

ATM Transaction with Locking

Example (Two concurrent ATM transactions with locking)

Transaction 1 Transaction 2 DB state

lock (acct) ; 1200read (acct) ;unlock (acct) ;

lock (acct) ;read (acct) ;unlock (acct) ;

lock (acct) ;write (acct) ; 1100unlock (acct) ;

lock (acct) ;write (acct) ; 1000unlock (acct) ;

⇒ Again: on its own, proper locking does not guaranteeserializability yet.

Page 34: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

27

Two-Phase Locking (2PL)

The two-phase locking protocol poses an additional restrictionon how transactions have to be written:

Definition (Two-Phase Locking)

• Once a transaction has released any lock (i.e., performed thefirst unlock), it must not acquire any new lock:

lock phase release phase

lock point

# of locks held

time

• Two-phase locking is the concurrency control protocol usedin database systems today.

Page 35: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

28

Again: ATM Transaction

Example (Two concurrent ATM transactions with locking, ¬ 2PL)

Transaction 1 Transaction 2 DB state

lock (acct) ; 1200read (acct) ;unlock (acct) ;

lock (acct) ;read (acct) ;unlock (acct) ;

lock (acct) ;

write (acct) ; 1100unlock (acct) ;

lock (acct) ;

write (acct) ; 1000unlock (acct) ;

These locks violate the 2PL principle.

Page 36: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

28

Again: ATM Transaction

Example (Two concurrent ATM transactions with locking, ¬ 2PL)

Transaction 1 Transaction 2 DB state

lock (acct) ; 1200read (acct) ;unlock (acct) ;

lock (acct) ;read (acct) ;unlock (acct) ;

lock (acct) ; write (acct) ; 1100unlock (acct) ;

lock (acct) ; write (acct) ; 1000unlock (acct) ;

These locks violate the 2PL principle.

Page 37: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

29

A 2PL-Compliant ATM Transaction

• To comply with the two-phase locking protocol, the ATMtransaction must not acquire any new locks after a first lockhas been released:

A 2PL-compliant ATM withdrawal transaction

1 lock (acct) ;2 bal ← read_bal (acct) ;3 bal ← bal − 100¤ ;4 write_bal (acct, bal) ;5 unlock (acct) ;

lock phase

unlock phase

Page 38: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

30

Resulting Schedule

Example

Transaction 1 Transaction 2 DB state

lock (acct) ; 1200

read (acct) ;

lock (acct) ;

write (acct) ; 1100

unlock (acct) ;

lock (acct) ;

read (acct) ;

write (acct) ; 900

unlock (acct) ;

Transactionblocked

• Theorem: The use of 2PL-compliant locking always leads toa correct and serializable schedule or to a deadlock.

Page 39: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

30

Resulting Schedule

Example

Transaction 1 Transaction 2 DB state

lock (acct) ; 1200

read (acct) ;

lock (acct) ;

write (acct) ; 1100

unlock (acct) ;

lock (acct) ;

read (acct) ;

write (acct) ; 900

unlock (acct) ;

Transactionblocked

• Theorem: The use of 2PL-compliant locking always leads toa correct and serializable schedule or to a deadlock.

Page 40: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

31

Lock Modes

• We saw earlier that two read operations do not conflict witheach other.

• Systems typically use different types of locks (lock modes) toallow read operations to run concurrently.

• read locks or shared locks: mode S• write locks or exclusive locks: mode X

• Locks are only in conflict if at least one of them is an X lock:

Shared vs. exclusive lock compatibility

shared (S) exclusive (X)shared (S) ×

exclusive (X) × ×

• It is a safe operation in two-phase locking to (try to) converta shared lock into an exclusive lock during the lockphase (lock upgrade)⇒ improved concurrency.

Page 41: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

32

Deadlocks

• Like many lock-based protocols, two-phase locking has therisk of deadlock situations:

Example (Proper schedule with locking)

Transaction 1 Transaction 2

lock (A) ;... lock (B) ;

do something...

... do something

lock (B) ;...

[ wait for T2 to release lock ] lock (A) ;[ wait for T1 to release lock ]

• Both transactions would wait for each other indefinitely.

Page 42: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

33

Deadlock Handling

• Deadlock detection:

1 The system maintains a waits-for graph, where an edgeT1 → T2 indicates that T1 is blocked by a lock held by T2.

2 Periodically, the system tests for cycles in the graph.3 If a cycle is detected, the deadlock is resolved by

aborting one or more transactions.4 Selecting the victim is a challenge:

• Aborting young transactions may lead tostarvation: the same transaction may be cancelledagain and again.

• Aborting an old transaction may cause a lot ofcomputational investment to be thrown away (butthe undo costs may be high).

• Deadlock prevention:Define an ordering� on all database objects. If A� B,then order the lock operations in all transactions in thesame way (lock(A) before lock(B)).

Page 43: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

33

Deadlock Handling

• Deadlock detection:

1 The system maintains a waits-for graph, where an edgeT1 → T2 indicates that T1 is blocked by a lock held by T2.

2 Periodically, the system tests for cycles in the graph.3 If a cycle is detected, the deadlock is resolved by

aborting one or more transactions.4 Selecting the victim is a challenge:

• Aborting young transactions may lead tostarvation: the same transaction may be cancelledagain and again.

• Aborting an old transaction may cause a lot ofcomputational investment to be thrown away (butthe undo costs may be high).

• Deadlock prevention:Define an ordering� on all database objects. If A� B,then order the lock operations in all transactions in thesame way (lock(A) before lock(B)).

Page 44: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

34

Deadlock Handling

Other common technique:

• Deadlock detection via timeout:Let a transaction T block on a lock request only until atimeout occurs (counter expires). On expiration, assume thata deadlock has occurred and abort T .

Timeout-based deadlock detection

db2 => GET DATABASE CONFIGURATION;

.

.

.Interval for checking deadlock (ms) (DLCHKTIME) = 10000Lock timeout (sec) (LOCKTIMEOUT) = 30

.

.

.

• Also: lock-less optimistic concurrency control (↗ slide 42).

Page 45: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

35

Variants of Two-Phase Locking

• The two-phase locking discipline does not prescribe exactlywhen locks have to be acquired and released.

• Two possible variants:

Preclaiming and strict 2PL

“lock phase” release phase

locksheld

time

Preclaiming 2PL

lock phase “release phase”

locksheld

time

Strict 2PL

� What could motivate either variant?

1 Preclaiming 2PL:

2 Strict 2PL:

Page 46: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

36

Cascading Rollbacks

Consider three transactions:

Transations T1,2,3, T1 fails later on

abort ;w(x)

r(x)

r(x)

T1

T2

T3

timet2t1

• When transaction T1 aborts, transactions T2 and T3 havealready read data written by T1 (↗ dirty read, slide 9)

• T2 and T3 need to be rolled back, too (cascading roll back).

• T2 and T3 cannot commit until the fate of T1 is known.

⇒ Strict 2PL can avoid cascading roll backs altogether. (How?)

Page 47: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

37

Granularity of Locking

The granularity of locking is a trade-off:

database level

tablespace level1

table level

page level

row-level high concurrency

low concurrency

high overhead

low overhead

⇒ Idea: multi-granularity locking.

1An DB2 tablespace represents a collection of tables that share a physicalstorage location.

Page 48: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

38

Multi-Granularity Locking

• Decide the granularity of locks held for each transaction(depending on the characteristics of the transaction):

• For example, aquire a row lock for

Row-selecting query (C_CUSTKEY is key)

1 SELECT * Q12 FROM CUSTOMERS3 WHERE C_CUSTKEY = 42

and a table lock for

Table scan query

1 SELECT * Q22 FROM CUSTOMERS

• How do such transactions know about each others’ locks?

• Note that locking is performance-critical. Q2 does notwant to do an extensive search for row-level conflicts.

Page 49: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

39

Intention Locks

Databases use an additional type of locks: intention locks.

• Lock mode intention share: IS

• Lock mode intention exclusive: IX

Extended conflict matrix

S X IS IX

S × ×X × × × ×IS ×IX × ×

• A lock I� on a coarser level of granularity means that thereis some � lock on a lower level.

Page 50: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

40

Intention Locks

Multi-granularity locking protocol

1 Before a granule g can be locked in � ∈ {S, X} mode, thetransaction has to obtain an I� lock on all coarsergranularities that contain g.

2 If all intention locks could be granted, the transaction canlock granule g in the announced � mode.

Example (Multi-granularity locking)

Query Q1 (↗ slide 38) would, e.g.,

• obtain an IS lock on table CUSTOMERS

(also on the containing tablespace and database) and

• obtain an S lock on the row(s) with C_CUSTKEY = 42.

Query Q2 would place an

• S lock on table CUSTOMERS

(and an IS lock on tablespace and database).

Page 51: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

41

Detecting Conflicts

• Now suppose an updating query comes in:

Update request

1 UPDATE CUSTOMERS Q32 SET NAME = ’Seven Teen’3 WHERE C_CUSTKEY = 17

• Q3 will want to place

• an IX lock on table CUSTOMER (and all coarser granules)and

• an X lock on the row holding customer 17.

As such it is• compatible with Q1

(there is no conflict between IX and IS on the tablelevel),

• but incompatible with Q2

(the table-level S lock held by Q2 is in conflict with Q3’sIX lock request).

Page 52: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

42

Optimistic Concurrency Control

• Up to here, the approach to concurrency control has beenpessimistic:

• Assume that transactions will conflict and thus protectdatabase objects by locks and lock protocols.

• The converse is a optimistic concurrency control approach:• Hope for the best and let transactions freely proceed

with their read/write operations.• Only just before updates are to be committed to the

database, perform a check to see whether conflictsindeed did not happen.

• Rationale: Non-serializable conflicts are not that frequent.Save the locking overhead in the majority of cases andonly invest effort if really required.

Page 53: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

43

Optimistic Concurrency Control

Under optimistic concurrency control, transactions proceed inthree phases:

Optimistic concurrency control

1 Read Phase. Execute transaction, but do not write data backto disk immediately. Instead, collect updates in thetransaction’s private workspace.

2 Validation Phase. When the transaction wants to commit,test whether its execution was correct (only acceptableconflicts happened). If it is not, abort the transaction.

3 Write Phase. Transfer data from private workspace intodatabase.

• Note: Phases 2 and 3 need to be performed in anon-interruptible critical section (thus also called theval-write phase).

Page 54: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

44

Validating Transactions

Validation is typically implemented by looking at transaction Tj’s

• Read Set RS(Tj) (attributes read by Tj) and

• Write Set WS(Tj) (attributes written by Tj).

Optimistic Concurrency Control

Compare Tj against all committed transactions Ti.Check succeeds if

Ti committed before Tj startedor

(RS(Tj) ∪WS(Tj)) ∩WS(Ti) = ∅ .

Page 55: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

45

Optimistic Concurrency Control in IBM DB2

• DB2 V9.5 provides SQL-level constructs that enable databaseapplications to implement optimistic concurrency control:

• RID(r): return row identifier for row r,• ROW CHANGE TOKEN FOR r: unique number reflecting

the time row r has last been updated.

Optimistic concurrency control

1 db2 => SELECT * FROM EMPLOYEES2

3 ID NAME DEPT SALARY4 ----------- ---- ---- -----------5 1 Alex DE 3006 2 Bert DE 1007 3 Cora DE 2008 4 Drew US 2009 5 Erik US 400

10

11 db2 => SELECT E.NAME, E.SALARY,12 RID(E) AS RID, ROW CHANGE TOKEN FOR E AS TOKEN13 FROM EMPLOYEES E14 WHERE E.NAME = ’Erik’15

16 NAME SALARY RID TOKEN17 ---- ----------- -------------------- --------------------------18 Erik 400 16777224 74904229642240

Page 56: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

46

Optimistic Concurrency Control in IBM DB2

Optimistic concurrency control

• The ’Erik’ row belongs to our read set. Save its RID and TOKEN

values to perform validation later.

• (. . . Time passes . . . ) Now try to save our changes to the row.Perform BOCC-style validation:

1 db2 => UPDATE EMPLOYEES E2 SET E.SALARY = 4503 WHERE RID(E) = 16777224 -- identify row4 AND ROW CHANGE TOKEN FOR E = 74904229642240 -- row changed?5

6 SQL0100W No row was found for FETCH, UPDATE or DELETE; or the7 result of a query is an empty table. SQLSTATE=020008

9 db2 => SELECT E.ID, E.NAME10 RID(E) AS RID, ROW CHANGE TOKEN FOR E AS TOKEN11 FROM EMPLOYEES E12

13 ID NAME RID TOKEN14 ----------- ---- -------------------- --------------------15 1 Alex 16777220 7490422964224016 2 Bert 16777221 7490422964224017 3 Cora 16777222 7490422964224018 4 Drew 16777223 7490422964224019 5 Erik 16777224 141378732653941710

Page 57: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

47

Multi-Version Concurrency Control

Looking back at the concurrency control strategies discussed upto this point, we have seen

1 Wait Mechanisms, i.e., locks and the associated two-phaselocking protocol,

2 Rollback Mechanisms, i.e., a conditional write phase thatmakes it trivial to take back any changes made by atransaction.

We now add

3 Timestamp Mechanisms that use a DBMS-wide clock toorder transactions and decide visibility of rows.

The resulting Multi-Version Concurrency Control (MVCC)protocol is lock-less but comes with a space overhead (thatrequires garbage collection of rows).

Page 58: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

47

Multi-Version Concurrency Control

Looking back at the concurrency control strategies discussed upto this point, we have seen

1 Wait Mechanisms, i.e., locks and the associated two-phaselocking protocol,

2 Rollback Mechanisms, i.e., a conditional write phase thatmakes it trivial to take back any changes made by atransaction.

We now add

3 Timestamp Mechanisms that use a DBMS-wide clock toorder transactions and decide visibility of rows.

The resulting Multi-Version Concurrency Control (MVCC)protocol is lock-less but comes with a space overhead (thatrequires garbage collection of rows).

Page 59: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

48

MVCC: Timestamps

• In MVCC, each transaction Ti is assigned a timestamp ti thatrepresents the point in time when Ti started.

• Can implement timestamps based on• actual system clock (resolution, uniqueness, portability

across OSs?),or

• a DBMS-internal counter used to assign transaction IDs(xid).

• Timestamp requirements:1 unique: ti 6= tj if i 6= j,2 ordered: ti < tj if Ti has started before Tj.

MVCC: Semantics

Under MVCC, a transaction Ti operates on the consistent state ofthe database that was current at time ti.

Page 60: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

48

MVCC: Timestamps

• In MVCC, each transaction Ti is assigned a timestamp ti thatrepresents the point in time when Ti started.

• Can implement timestamps based on• actual system clock (resolution, uniqueness, portability

across OSs?),or

• a DBMS-internal counter used to assign transaction IDs(xid).

• Timestamp requirements:1 unique: ti 6= tj if i 6= j,2 ordered: ti < tj if Ti has started before Tj.

MVCC: Semantics

Under MVCC, a transaction Ti operates on the consistent state ofthe database that was current at time ti.

Page 61: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

49

MVCC: Versions and Snapshots

In a concurrent DBMS, operations can conflict if they write thesame database object (recall relation =). Thus:

MVCC: Versions

Under MVCC, multiple versions of the same database objectmay exist at one time. Different transactions may read/writedifferent (not necessarily the most recent) object versions.

MVCC uses snapshots to identify exactly which version of eachdatabase object are visible to a transaction:

MVCC: Snapshot

To take a snapshot, gather the following information:

• the highest xid of all committed transactions,

• a list of xids of all transactions currently executing.

Typically, a snapshot is taken at the time the transaction starts.

Page 62: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

49

MVCC: Versions and Snapshots

In a concurrent DBMS, operations can conflict if they write thesame database object (recall relation =). Thus:

MVCC: Versions

Under MVCC, multiple versions of the same database objectmay exist at one time. Different transactions may read/writedifferent (not necessarily the most recent) object versions.

MVCC uses snapshots to identify exactly which version of eachdatabase object are visible to a transaction:

MVCC: Snapshot

To take a snapshot, gather the following information:

• the highest xid of all committed transactions,

• a list of xids of all transactions currently executing.

Typically, a snapshot is taken at the time the transaction starts.

Page 63: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

50

MVCC: Row Timestamps

Database Object ≡ Row

PostgreSQL implements MVCC at a granularity of rows: multipleversions of the same row may exist. Adopt this model in whatfollows.

• To help decide whether a particular row version is includedin (or excluded from) a snapshot, attach to each row versiontwo virtual/hidden attributes:

1 xmin: the xid of the transaction that created this row,2 xmax: the xid of the transaction that deleted this row.

• Row updates are modellled as the two-step operation rowdeletion, then creation.

MVCC: No Physical Deletion!

Note: 2 implies that rows are not actually physically deleted.Instead, their xmax attribute is modified to record thedeleting/updating transaction (⇒ eventual row garbage).

Page 64: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

50

MVCC: Row Timestamps

Database Object ≡ Row

PostgreSQL implements MVCC at a granularity of rows: multipleversions of the same row may exist. Adopt this model in whatfollows.

• To help decide whether a particular row version is includedin (or excluded from) a snapshot, attach to each row versiontwo virtual/hidden attributes:

1 xmin: the xid of the transaction that created this row,2 xmax: the xid of the transaction that deleted this row.

• Row updates are modellled as the two-step operation rowdeletion, then creation.

MVCC: No Physical Deletion!

Note: 2 implies that rows are not actually physically deleted.Instead, their xmax attribute is modified to record thedeleting/updating transaction (⇒ eventual row garbage).

Page 65: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · ·

X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 66: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · ·

X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 67: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · · X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 68: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · · X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 69: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · · X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 70: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · · X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 71: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

51

MVCC: Row Visibility (1)

Example (Decide Row Visibility)

• Current snapshot:2 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 — · · · X

2xmin xmax · · · data · · ·

50 — · · ·

3xmin xmax · · · data · · ·110 — · · ·

2For simplicity: assume that all other xids have committed (and not rolledback their work).

Page 72: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · ·

X

3xmin xmax · · · data · · ·

30 110 · · ·

X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 73: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · ·

X

3xmin xmax · · · data · · ·

30 110 · · ·

X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 74: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · ·

X

3xmin xmax · · · data · · ·

30 110 · · ·

X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 75: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · · X

3xmin xmax · · · data · · ·

30 110 · · ·

X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 76: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · · X

3xmin xmax · · · data · · ·

30 110 · · ·

X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 77: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · · X

3xmin xmax · · · data · · ·

30 110 · · · X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 78: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

52

MVCC: Row Visibility (2)

Example (Decide Row Visibility)

• Current snapshot: 〈 100︸ ︷︷ ︸highest committed xid

, [25, 50, 75]︸ ︷︷ ︸currently active xids

• Row visible in snapshot?

1xmin xmax · · · data · · ·

30 80 · · ·

2xmin xmax · · · data · · ·

30 75 · · · X

3xmin xmax · · · data · · ·

30 110 · · · X

• Given the current state of the system, may row 1 beconsidered garbage that can be collected?

Page 79: Chapter 11 - db.inf.uni-tuebingen.de · Chapter 11 Transaction Management Concurrent and Consistent Data Access Architecture and Implementation of Database Systems Summer 2014 Torsten

TransactionManagement

Torsten Grust

ACID Properties

Anomalies

The Scheduler

Serializability

Query Scheduling

Locking

Two-Phase Locking

OptimisticConcurrency Protocol

Multi-VersionConcurrency Control

53

MVCC: Garbage Collection

• The creation of new row versions during UPDATE (rather thanreplacing the existing row) requires the reclamation ofstorage space used by old row versions.

• Delay such row garbage collection until the old versions areguaranteed to be invisible to all current and futuretransactions.

Delayed Row Garbage Collection

Exactly when is it safe to declare a row as garbage and mark it forcollection?

Row Cleanup

• On-demand cleanup of a single page when page isaccessed during SELECT, UPDATE, DELETE.

• Bulk cleanup by scheduled auto-vacuum process or via anexplicit VACUUM command.