distributed database systems - pdfs.semanticscholar.org · gtm & sm fully integrated nodes...

58
1 Distributed Database Systems Vera Goebel Department of Informatics University of Oslo 2011

Upload: hoangdieu

Post on 19-Sep-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

1

Distributed Database Systems

Vera Goebel

Department of Informatics

University of Oslo

2011

Page 2: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

2

Contents • Review: Layered DBMS Architecture

• Distributed DBMS Architectures

– DDBMS Taxonomy

• Client/Server Models

• Key Problems of Distributed DBMS

– Distributed data modeling

– Distributed query processing & optimization

– Distributed transaction management

• Concurrency Control

• Recovery

Page 3: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

3

Functional Layers of a DBMS

Database

Applications

Data Model / View Management

Semantic Integrity Control / Authorization

Query Processing and Optimization

Storage Structures

Buffer Management

Concurrency Control / Logging

Interface

Control

Compilation

Execution

Data Access

Consistency

Tra

nsaction M

anagem

ent

Page 4: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

4

Dependencies among DBMS components

Log component

(with savepoint mgmt)

Lock

component

System Buffer

Mgmt

Central

Components

Access Path

Mgmt

Transaction

Mgmt

Sorting

component

Indicates a dependency

Page 5: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

5

Centralized DBS

• logically integrated

• physically centralized

network

T1 T2

T4

T3

T5

DBS

Traditionally: one large mainframe DBMS + n “stupid” terminals

Page 6: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

6

Distributed DBS

• Data logically integrated (i.e., access based on one schema)

• Data physically distributed among multiple database nodes

• Processing is distributed among multiple database nodes

network

T1 T2

T3

DBS1

DBS3

DBS2

Traditionally: m mainframes for the DBMSs + n terminals

Why a Distributed DBS?

Performance via parallel execution

- more users

- quick response

More data volume

- max disks on multiple nodes

Page 7: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

7

Centralized

DBMS

Distributed

DBMS

Homogeneous DBMS

Centralized

Heterogeneous DBMS

Homogeneous DBMS

Distributed

Heterogeneous DBMS

DBMS Implementation Alternatives

Distribution

Heterogeneity

Autonomy Centralized Homog.

Distributed Homog.

Federated DBMS

Distributed Heterog.

Federated DBMS

Centralized Heterog.

Federated DBMS

Client/Server Distribution

Centralized Homog.

Distributed

Multi-DBMS

Distributed Heterog.

Multi-DBMS

Centralized Heterog.

Multi-DBMS

Federated DBMS Multi-DBMS

Page 8: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

8

Common DBMS Architectural Configurations

No Autonomy Federated Multi

N1

N3 N4

N2

D3

D2

D4

D1

D3 D1 D2 D4

GTM & SM

Fully integrated nodes

Complete cooperation on:

- Data Schema

- Transaction Mgmt

Fully aware of all other nodes in the DDBS

Independent DBMSs

Implement some “cooperation functions”:

- Transaction Mgmt

- Schema mapping

Aware of other DBSs in the federation

Fully independent DBMSs

Independent, global manager tries to coordinate the DBMSs using only existing DBMS services

Unaware of GTM and other DBSs in the MultiDBS

Page 9: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

9

Parallel Database System Platforms (Non-autonomous Distributed DBMS)

Shared Everything

Disk 1 Disk p ...

Fast Interconnection

Network

Proc1 ProcN Proc2 ... Mem1

Mem2

MemM

. . .

Typically a special-purpose hardware

configuration, but this can be emulated

in software.

Shared Nothing

Disk 1 Disk N ...

...

Fast Interconnection Network

Proc1 ProcN

Mem1 MemN

Can be a special-purpose hardware

configuration, or a network of workstations.

Page 10: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

10

Distributed DBMS

• Advantages: – Improved performance

– Efficiency

– Extensibility (addition of new nodes)

– Transparency of distribution • Storage of data

• Query execution

– Autonomy of individual nodes

• Problems: – Complexity of design and implementation

– Data consistency

– Safety

– Failure recovery

Page 11: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

11

Client/Server Database Systems

The “Simple” Case of Distributed

Database Systems?

Page 12: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

12

Client/Server Environments

• objects stored and administered on server

• objects processed (accessed and modified) on

workstations [sometimes on the server too]

• CPU-time intensive applications on workstations

– GUIs, design tools

• Use client system local storage capabilities

• combination with distributed DBS services

=> distributed server architecture

data (object) server + n smart clients (workstations)

Page 13: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

13

Clients with Centralized Server Architecture

Disk 1 Disk n

...

...

Data

Server

database

local communication

network

interface

database functions

workstations

Page 14: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

14

Clients with Distributed Server Architecture

Disk 1 Disk n

...

...

Data Server 1

database

local communication network

interface

local data mgnt. functions

workstations

Data Server m

distributed DBMS ...

Disk 1 Disk p

... database

interface

local data mgnt. functions

distributed DBMS

Page 15: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

15

Clients with Data Server Approach

Disk 1 Disk n

...

...

Application

Server

Data

Server

database

communication

channel

user interface

query parsing

data server interface

application server interface

database functions

“3-tier client/server”

Page 16: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

16

In the Client/Server DBMS Architecture, how are the DB services organized?

There are several architectural options!

Page 17: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

17

Client/Server Architectures

Relational

Application

client process

server process

SQL cursor

management

Object-Oriented

Application

client process

server process

objects,

pages, or

files

object/page cache

management

DBMS

Page 18: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

18

Object Server Architecture

Object

Cache

Page

Cache

Log/Lock

Manager Object

Manager

Storage Allocation

and I/O

objects

object references

queries

method calls

locks

log records

Object

Cache

Application

Object Manager

client process

server process

File/Index

Manager

Page Cache

Manager

Database

Page 19: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

19

Object Server Architecture - summary • Unit of transfer: object(s)

• Server understands the object concept and can execute methods

• most DBMS functionality is replicated on client(s) and server(s)

• Advantages: - server and client can run methods, workload can be balanced - simplifies concurrency control design (centralized in server) - implementation of object-level locking - low cost for enforcing “constraints” on objects

• Disadvantages/problems: - remote procedure calls (RPC) for object references - complex server design - client cache consistency problems - page-level locking, objects get copied multiple times, large objects

Page 20: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

20

Page Server Architecture

Page

Cache

database

Log/Lock

Manager Page Cache

Manager

Storage Allocation

and I/O

pages

page references

locks

log records

server process

client process

Object

Cache

Page

Cache

Application

Object Manager

File/Index

Manager

Page Cache

Manager

Page 21: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

21

Page Server Architecture - summary

• unit of transfer: page(s)

• server deals only with pages (understands no object semantics)

• server functionality: storage/retrieval of page(s), concurrency control, recovery

• advantages: - most DBMS functionality at client - page transfer -> server overhead minimized - more clients can be supported - object clustering improves performance considerably

• disadvantages/problems:

- method execution only on clients

- object-level locking

- object clustering

- large client buffer pool required

Page 22: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

22

File Server Architecture

database

Log/Lock

Manager

Space

Allocation

locks

log records

server process

pages

page references

NFS

Object

Cache

Application

Object Manager

client process

File/Index

Manager

Page Cache

Manager

Page

Cache

Page 23: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

23

File Server Architecture - summary

• unit of transfer: page(s)

• simplification of page server, clients use remote file system (e.g., NFS) to read and write DB page(s) directly

• server functionality: I/O handling, concurrency control, recovery

• advantages: - same as for page server - NFS: user-level context switches can be avoided - NFS widely-used -> stable SW, will be improved

• disadvantages/problems:

- same as for page server

- NFS write are slow

- read operations bypass the server -> no request combination

- object clustering

- coordination of disk space allocation

Page 24: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

24

Client sends 1 msg

per lock to server;

Client waits;

Server replies with

ACK or NACK.

Client sends 1 msg

per lock to the server;

Client continues;

Server invalidates

cached copies at

other clients.

Client sends all write

lock requests to the

server at commit time;

Client waits;

Server replies when

all cached copies are

freed.

Client sends object

status query to server

for each access;

Client waits;

Server replies.

Client sends 1 msg

per lock to the server;

Client continues;

After commit, the

server sends updates

to all cached copies.

Client sends all write

lock requests to the

server at commit time;

Client waits;

Server replies based

on W-W conflicts only.

Cache Consistency in Client/Server Architectures

Avoidance-

Based

Algorithms

Detection-

Based

Algorithms

Synchronous Asynchronous Deferred

Best Performing Algorithms

Page 25: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

25

Comparison of the 3 Client/Server Architectures

• Simple server design

• Complex client design

• Fine grained concurrency control difficult

• Very sensitive to client buffer pool size and clustering

• Complex server design

• “Relatively” simple client design

• Fine-grained concurrency control

• Reduces data movement, relatively insensitive to clustering

• Sensitive to client buffer pool size

Page & File Server Object Server

Conclusions:

• No clear winner

• Depends on object size and application’s object access pattern

• File server ruled out by poor NFS performance

Page 26: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

26

Problems in Distributed DBMS Services Distributed Database Design

Distributed Directory/Catalogue Mgmt

Distributed Query Processing and Optimization

Distributed Transaction Mgmt

– Distributed Concurreny Control

– Distributed Deadlock Mgmt

– Distributed Recovery Mgmt

influences

query processing

directory management

distributed DB design reliability (log)

concurrency control

(lock)

deadlock management

transaction

management

Page 27: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

27

Distributed Storage in Relational DBMSs

• horizontal fragmentation: distribution of “rows”, selection

• vertical fragmentation: distribution of “columns”, projection

• hybrid fragmentation: ”projected columns” from ”selected rows”

• allocation: which fragment is assigned to which node?

• replication: multiple copies at different nodes, how many copies?

• Evaluation Criteria

– Cost metrics for: network traffic, query processing, transaction mgmt

– A system-wide goal: Maximize throughput or minimize latency

• Design factors:

– Most frequent query access patterns

– Available distributed query processing algorithms

Page 28: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

28

Distributed Storage in OODBMSs

• Must fragment, allocate, and replicate object data among nodes

• Complicating Factors: – Encapsulation hides object structure

– Object methods (bound to class not instance)

– Users dynamically create new classes

– Complex objects

– Effects on garbage collection algorithm and performance

• Approaches: – Store objects in relations and use RDBMS strategies to distribute data

• All objects are stored in binary relations [OID, attr-value]

• One relation per class attribute

• Use nested relations to store complex (multi-class) objects

– Use object semantics

• Evaluation Criteria: – Affinity metric (maximize)

– Cost metric (minimize)

Page 29: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

29

Node1: : Node2

Node1: : Node2

Horizontal Partitioning using Object Semantics

• Divide instances of a class into multiple groups

– Based on selected attribute values

– Based on subclass designation

Class C

Subcl X Subcl Y

Attr A < 100 Attr A >= 100

• Fragment an object’s complex attributes from it’s simple attributes

• Fragment an object’s attributes based on typical method invocation sequence

– Keep sequentially referenced attributes together

Page 30: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

30

Vertical Partitioning using Object Semantics

• Fragment object attributes based on the

class hierarchy

Breaks object encapsulation?

Instances of Class X

Instances of Subclass subX

Instances of Subclass subsubX

Fragment #1 Fragment #2

Fragment #3

Page 31: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

31

Path Partitioning using Object Semantics

• Use the object composition graph for complex

objects

• A path partition is the set of objects

corresponding to instance variables in the

subtree rooted at the composite object

– Typically represented in a ”structure index”

Composite

Object OID {SubC#1-OID, SubC#2-OID, …SubC#n-OID}

Page 32: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

32

More Issues in OODBMS Fragmentation

Local Object Data

Remote

Methods

Remote Object Data

Local

Methods

• Replication Options – Objects

– Classes (collections of objects)

– Methods

– Class/Type specifications

Good performance;

Issue: Replicate methods so they

are local to all instances?

Send data to remote site, execute, and return result OR fetch the method and execute;

Issues: Time/cost for two transfers? Ability to execute locally?

Fetch the remote data and execute

the methods;

Issue: Time/cost for data transfer?

Send additional input values via

RPC, execute and return result;

Issues: RPC time?

Execution load on remote node?

Page 33: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

33

Distributed Directory and Catalogue Management

• Directory Information:

– Description and location of records/objects

• Size, special data properties (e.g.,executable, DB type,

user-defined type, etc.)

• Fragmentation scheme

– Definitions for views, integrity constraints

Issues: bottleneck, unreliable

Issues: complicated access protocol, consistency

Issues: consistency, storage overhead

• Options for organizing the directory:

– Centralized

– Fully replicated

– Partitioned

– Combination of partitioned and replicated e.g., zoned,

replicated zoned

Page 34: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

34

Distributed Query Processing and Optimization

• Construction and execution of query plans, query optimization

• Goals: maximize parallelism (response time optimization) minimize network data transfer (throughput optimization)

• Basic Approaches to distributed query processing:

Pipelining – functional decomposition

Parallelism – data decomposition

Node 1 Rel A

Select A.x > 100

Node 2 Rel B

Join A and B on y

Node 1

Node 2

Frag A.1

Frag A.2

Select A.x > 100

Processing Timelines

Node 1:

Node 2:

Node 1:

Node 3:

Processing Timelines

Node 2:

Node 3

Union

Page 35: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

35

Creating the Distributed Query Processing Plan

• Factors to be considered:

- distribution of data

- communication costs

- lack of sufficient locally available information

• 4 processing steps:

(1) query decomposition

(2) data localization control site (uses global information)

(3) global optimization

(4) local optimization local sites (use local information)

Page 36: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

36

Generic Layered Scheme for Planning a Dist. Query calculus query on distributed objects

query decomposition

algebraic query on distributed objects

Global

schema

data localization

fragment query

Fragment

schema

global optimization

optimized fragment query

with communication operations

Statistics on

fragments

local optimization

optimized local queries

Local

schema

contr

ol site

local site

s

This approach was

initially designed for

distributed relational

DBMSs.

It also applies to

distributed OODBMSs,

Multi-DBMSs, and

distributed Multi-

DBMSs.

Page 37: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

37

Distributed Query Processing Plans in OODBMSs

• Two forms of queries – Explicit queries written in OQL

– Object navigation/traversal Reduce to logical

algebraic expressions } • Planning and Optimizing

– Additional rewriting rules for sets of objects • Union, Intersection, and Selection

– Cost function • Should consider object size, structure, location, indexes, etc.

• Breaks object encapsulation to obtain this info?

• Objects can ”reflect” their access cost estimate

– Volatile object databases can invalidate optimizations • Dynamic Plan Selection (compile N plans; select 1 at runtime)

• Periodic replan during long-running queries

Page 38: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

38

Distributed Query Optimization • information needed for optimization (fragment statistics):

- size of objects, image sizes of attributes

- transfer costs

- workload among nodes

- physical data layout

- access path, indexes, clustering information

- properties of the result (objects) - formulas for estimating the

cardinalities of operation results

• execution cost is expressed as a weighted combination of I/O,

CPU, and communication costs (mostly dominant).

total-cost = CCPU * #insts + CI/O * #I/Os + CMSG * #msgs + CTR * #bytes

Optimization Goals: response time of single transaction or system throughput

Page 39: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

39

Transaction Manager

Distributed Transaction Managment • Transaction Management (TM) in centralized DBS

– Achieves transaction ACID properties by using: • concurrency control (CC)

• recovery (logging)

• TM in DDBS

– Achieves transaction ACID properties by using:

Concurrency Control

(Isolation)

Commit/Abort Protocol

(Atomicity)

Recovery Protocol

(Durability)

Replica Control Protocol

(Mutual Consistency)

Log Manager

Buffer Manager

Strong algorithmic

dependencies

Page 40: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

40

Classification of Concurrency Control Approaches

CC Approaches

Pessimistic Optimistic

Locking Timestamp

Ordering Hybrid Locking

Timestamp

Ordering

Centralized

Primary

Copy

Distributed

Basic

Multi-

Version

Conser-

vative

phases of pessimistic

transaction execution

validate read compute write

phases of optimistic

transaction execution

read compute validate write

Isolation

Page 41: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

41

Two-Phase-Locking Approach (2PL)

BEGIN LOCK END

POINT

Obtain Lock

Release Lock

Transaction

Duration N

um

be

r o

f lo

cks 2PL Lock Graph

Strict 2PL Lock Graph

BEGIN END Period of data item use

Obtain Lock

Release Lock

Transaction

Duration

Nu

mb

er

of lo

cks

Isolation

Page 42: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

42

Communication Structure of Centralized 2PL

Data Processors at Coordinating TM Central Site LM

participating sites

2

Lock Granted

3

Operation

4

End of Operation

5

Release Locks

1

Lock Request

Isolation

Page 43: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

43

Communication Structure of Distributed 2PL

Data Processors at Participating LMs Coordinating TM

participating sites

1

Lock Request 2

Operation

3

End of Operation

4

Release Locks

Isolation

Page 44: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

44

Distributed Deadlock Management

Example

Site X

Site Y

T1 x

holds lock Lc

T2 y

holds lock Ld

T1 x needs a

waits for T1

on site y

to complete

T2 y needs b

waits for T2

on site x

to complete

T1 y

waits for T2 y

to release Ld

T2 x

waits for T1 x

to release Lc

holds lock La

holds lock Lb

Isolation

Distributed Waits-For Graph

- requires many messages

to update lock status

- combine the “wait-for”

tracing message with lock

status update messages

- still an expensive algorithm

Page 45: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

45

Communication Structure of Centralized 2P Commit Protocol

Coordinating TM Participating Sites

1

Prepare

2

Vote-Commit or Vote-Abort

Make local commit/abort decision Write log entry

3

Global-Abort or Global-Commit

Count the votes If (missing any votes or any Vote-Abort) Then Global-Abort Else Global-Commit

4

ACK

Write log entry

Who participates?

Depends on the CC alg.

Other communication

structures are

possible:

– Linear

– Distributed

Atomicity

Page 46: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

46

State Transitions for 2P Commit Protocol

Advantages:

- Preserves atomicity

- All processes are

“synchronous within

one state transition”

Coordinating TM Participating Sites

2

Vote-Commit

or Vote-Abort

1

Prepare

3

Global-Abort or

Global-Commit

4

ACK

Initial

Initial Wait

Commit Abort

Commit

Ready

Abort

Disadvantages:

- Many, many messages

- If failure occurs,

the 2PC protocol blocks!

Attempted solutions for the

blocking problem:

1) Termination Protocol

2) 3-Phase Commit

3) Quorum 3P Commit

Atomicity

Page 47: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

47

Failures in a Distributed System

Types of Failure: – Transaction failure

– Node failure

– Media failure

– Network failure • Partitions each containing 1 or more sites

Who addresses the problem?

Termination Protocols

Modified Concurrency Control

& Commit/Abort Protocols

Recovery Protocols,

Termination Protocols,

& Replica Control Protocols

Issues to be addressed: – How to continue service

– How to maintain ACID properties while providing continued service

– How to ensure ACID properties after recovery from the failure(s)

Page 48: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

48

Termination Protocols

Timeout states:

Coordinator: wait, commit, abort

Participant: initial, ready

Use timeouts to detect

potential failures that could

block protocol progress

Coordinating TM Participating Sites

2

Vote-Commit

or Vote-Abort

4

ACK

3

Global-Abort or

Global-Commit

1

Prepare

Initial

Wait Initial

Commit Abort

Commit Abort

Ready

Coordinator Termination Protocol:

Wait – Send global-abort

Commit or Abort – BLOCKED!

Participant Termination Protocol:

Ready – Query the coordinator If timeout then query other participants; If global-abort | global-commit then proceed and terminate else BLOCKED!

Atomicity, Consistency

Page 49: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

49

Replica Control Protocols

• Update propagation of committed write operations

Period of data use

Nu

mb

er

of lo

cks

Obtain Locks

COMMIT POINT

Release

Locks

BEGIN END

2-Phase Commit

STRICT UPDATE

LAZY UPDATE

Consistency

Page 50: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

50

Strict Replica Control Protocol

• Read-One-Write-All (ROWA)

• Part of the Concurrency Control Protocol

and the 2-Phase Commit Protocol

– CC locks all copies

– 2PC propagates the updated values with 2PC

messages (or an update propagation phase is

inserted between the wait and commit states

for those nodes holding an updateable value).

Consistency

Page 51: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

51

Lazy Replica Control Protocol

• Propagates updates from a primary node.

• Concurrency Control algorithm locks the primary copy node (same node as the primary lock node).

• To preserve single copy semantics, must ensure that a transaction reads a current copy. – Changes the CC algorithm for read-locks

– Adds an extra communication cost for reading data

• Extended transaction models may not require single copy semantics.

Consistency

Page 52: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

52

Recovery in Distributed Systems

Select COMMIT or ABORT (or blocked) for each interrupted subtransaction

Implementation requires knowledge of:

– Buffer manager algorithms for writing updated data

from volatile storage buffers to persistent storage

– Concurrency Control Algorithm

– Commit/Abort Protocols

– Replica Control Protocol

Atomicity, Durability

Abort Approaches:

Undo – use the undo/redo log to backout all the writes that were actually

performed

Compensation – use the transaction log to select and execute ”reverse”

subtransactions that semantically undo the write operations.

Commit Approaches:

Redo – use the undo/redo log to perform all the write operations again

Retry – use the transaction log to redo the entire subtransaction (R + W)

Page 53: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

53

Network Partitions in Distributed Systems

Issues:

Termination of interrupted transactions

Partition integration upon recovery from a network failure

– Data availability while failure is ongoing

network

N1 N2

N6

N3

N5

N4

Partition #1 Partition #3

N8

N7

Partition #2

Page 54: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

54

Data Availability in Partitioned Networks

• Concurrency Control model impacts data availability.

• ROWA – data replicated in multiple partitions is not

available for reading or writing.

• Primary Copy Node CC – can execute transactions if

the primary copy node for all of the read-set and all

of the write-set are in the client’s partition.

Availability is still very limited . . . We need a new idea!

Page 55: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

55

Quorums

• Quorum – a special type of majority

• Use quorums in the Concurrency Control, Commit/Abort, Termination, and Recovery Protocols – CC uses read-quorum & write-quorum

– C/A, Term, & Recov use commit-quorum & abort-quorum

• Advantages: – More transactions can be executed during site failure and

network failure (and still retain ACID properties)

• Disadvantages: – Many messages are required to establish a quorum

– Necessity for a read-quorum slows down read operations

– Not quite sufficient (failures are not “clean”)

ACID

Page 56: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

56

Read-Quorums and Write-Quorums

• The Concurrency Control Algorithm serializes valid transactions in a partition. It must obtain – A read-quorum for each read operation

– A write-quorum for each write operation

• Let N=total number of nodes in the system

• Define the size of the read-quorum Nr and the size of the write-quorum Nw as follows: – Nr + Nw > N

– Nw > (N/2)

• When failures occur, it is possible to have a valid read quorum and no valid write quorum

Simple Example:

N = 8 Nr = 4 Nw = 5

Isolation

Page 57: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

57

Commit-Quorums and Abort-Quorums

• The Commit/Abort Protocol requires votes from all

participants to commit or abort a transaction.

– Commit a transaction if the accessible nodes can form a commit-

quorum

– Abort a transaction if the accessible nodes can form an abort-quorum

– Elect a new coordinator node (if necessary)

– Try to form a commit-quorum before attempting

to form an abort-quorum

• Let N=total number of nodes in the system

• Define the size of the commit-quorum Nc and the size of

abort-quorum Na as follows:

– Na + Nc > N; 0 Na, Nc N

Simple Examples:

N = 7 Nc = 4 Na = 4

N = 7 Nc = 5 Na = 3

ACID

Page 58: Distributed Database Systems - pdfs.semanticscholar.org · GTM & SM Fully integrated nodes Complete cooperation on: - Data Schema ... server process cursor SQL management Object-Oriented

58

Conclusions

• Nearly all commerical relational database systems offer some

form of distribution

– Client/server at a minimum

• Only a few commercial object-oriented database systems

support distribution beyond N clients and 1 server

• Future research directions:

– Distributed data architecture and placement schemes (app-influenced)

– Distributed databases as part of Internet applications

– Continued service during disconnection or failure

• Mobile systems accessing databases

• Relaxed transaction models for semantically-correct continued operation