Replication (1)

Topics

• Why Replication?
• Consistency Models: how do we reason about the consistency of the “global state”?
  • Data-centric consistency
  • Client-centric consistency
• We will examine consistency protocols, which describe an implementation of a specific consistency model.
• Other Implementation Issues
• Examples

Readings

Van Steen and Tanenbaum: 6.1, 6.2, 6.3, 6.4

Coulouris: 11, 14

Why Replicate?

Replication refers to the maintenance of copies at multiple sites.

Reliability
• If one replica is unavailable or crashes, use another.
• Avoid single points of failure.

Performance
• Placing copies of data close to the processes using them can improve performance by reducing access time.
• If there is only one copy, the server could become overloaded.

Common Replication Examples

• DNS naming service
• Web browsers often store a local copy of a previously fetched web page. This is referred to as caching a web page.
• Replication of a database
• Replication of game state

Replication Problem

Multiple copies may lead to consistency problems.

Whenever a copy is modified, that copy becomes different from the rest.

Modifications have to be carried out on all copies to ensure consistency.

The type of application has an impact on the consistency requirements needed and thus on the implementation.

Consistency Model

Some applications (e.g., banking) require that update operations are performed in the same order at each copy.
• This is referred to as sequential consistency.
• Possible implementation: Lamport’s clocks

Other applications (e.g., bulletin board) require that if one update, U1, causes another update, U2, to occur, then U1 is executed before U2 at each copy.
• This is referred to as causal consistency.
• Possible implementation: vector clocks
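The vector-clock idea for causal consistency can be sketched roughly as follows. This is a minimal illustration, not a protocol from the readings: the class `Replica` and its methods `send` and `receive` are hypothetical names, processes are numbered 0..n-1, and the network is simulated by calling `receive` directly. A replica holds back an update until every update it causally depends on has been applied.

```python
class Replica:
    """One replica that delivers updates in causal order using vector clocks."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n   # clock[k] = number of updates from process k applied here
        self.pending = []      # updates received but not yet deliverable
        self.applied = []      # updates applied, in delivery order

    def send(self, update):
        """Apply a local update and return the message to multicast."""
        self.clock[self.pid] += 1
        self.applied.append(update)
        return (self.pid, list(self.clock), update)

    def _deliverable(self, sender, stamp):
        # It is the next update expected from the sender, and everything the
        # sender had seen from other processes has already been applied here.
        return (stamp[sender] == self.clock[sender] + 1 and
                all(stamp[k] <= self.clock[k]
                    for k in range(len(stamp)) if k != sender))

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:                      # retry held-back updates after each delivery
            progress = False
            for m in list(self.pending):
                sender, stamp, update = m
                if self._deliverable(sender, stamp):
                    self.clock[sender] += 1
                    self.applied.append(update)
                    self.pending.remove(m)
                    progress = True


a, b, c = Replica(0, 3), Replica(1, 3), Replica(2, 3)
u1 = a.send("U1")     # a posts a message
b.receive(u1)
u2 = b.send("U2")     # b replies after having seen U1, so U1 causes U2
c.receive(u2)         # U2 arrives first at c and is held back
c.receive(u1)         # U1 arrives; both can now be applied
print(c.applied)      # ['U1', 'U2']
```

Updates that are not causally related may still be applied in different orders at different replicas, which is what distinguishes this from sequential consistency.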

Consistency Model

Observe that the type of application determines the type of consistency model to be used, even though every case involves replication.

A consistency model describes the rules to be used in updating replicated data.

There are more consistency models than sequential and causal.

Other consistency models:
• FIFO
• Strict

FIFO Consistency

Writes done by a single process are seen by all other processes in the order in which they were issued

… but writes from different processes may be seen in a different order by different processes.

i.e., there are no guarantees about the order in which different processes see writes, except that two or more writes from a single source must arrive in order.

FIFO Consistency

Example: caches in web browsers
• All updates to a page are made by the page owner.
• There is no conflict between two writes.
• Note: If a web page is updated twice in a very short period of time, it is possible that the browser does not see the first update.

Implementation:
• Each process adds the following to an update message: (process id, sequence number)
• Each other process applies the update messages from a single process in the order in which that process sent them (i.e., by sequence number).
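A minimal sketch of the receiving side of this scheme is shown below. The class name `FifoReceiver` is hypothetical, and the sketch assumes each sender numbers its updates 1, 2, 3, ... An update that arrives ahead of an earlier one from the same sender is held until the gap is filled; updates from different senders are not ordered with respect to each other.

```python
from collections import defaultdict


class FifoReceiver:
    def __init__(self):
        self.next_seq = defaultdict(lambda: 1)  # next expected sequence number per sender
        self.held = defaultdict(dict)           # sender -> {seq: update} that arrived early
        self.applied = []                       # updates applied, in order

    def receive(self, pid, seq, update):
        self.held[pid][seq] = update
        # Apply as many consecutive updates from this sender as possible.
        while self.next_seq[pid] in self.held[pid]:
            self.applied.append(self.held[pid].pop(self.next_seq[pid]))
            self.next_seq[pid] += 1


r = FifoReceiver()
r.receive("p1", 2, "p1: second write")   # arrives early, held back
r.receive("p2", 1, "p2: first write")    # applied immediately
r.receive("p1", 1, "p1: first write")    # fills the gap; both p1 writes are applied
print(r.applied)
# ['p2: first write', 'p1: first write', 'p1: second write']
```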

Strict Consistency

Strict consistency is defined as follows:
• A read is expected to return the value resulting from the most recent write operation.
• Assumes absolute global time.
• All writes are instantaneously visible to all processes.

Example:
• Suppose that process pi updates the value of x from 4 to 5 at time t1 and multicasts this value to all replicas.
• Process pj reads the value of x at t2 (t2 > t1).
• Process pj should read x as 5 regardless of the size of the (t2-t1) interval.

Strict Consistency

What if t2-t1 = 1 nsec and the optical fibre between the host machines running the two processes is 3 meters long?
• The update message would have to travel at 10 times the speed of light.
• This is not allowed by Einstein’s special theory of relativity.
• We can’t have strict consistency.
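The "10 times the speed of light" figure can be checked with a few lines of arithmetic. The variable names are illustrative; the only outside fact used is the speed of light in vacuum (light in optical fibre is slower still, so the real gap is even larger).

```python
SPEED_OF_LIGHT = 299_792_458   # metres per second, in vacuum

distance_m = 3.0               # length of the fibre
interval_s = 1e-9              # t2 - t1 = 1 nsec

required_speed = distance_m / interval_s      # about 3e9 m/s
ratio = required_speed / SPEED_OF_LIGHT       # roughly 10
print(f"required speed is about {ratio:.1f} times the speed of light")
```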

Implementation Options: Sequential Consistency

We saw how to use Lamport’s logical clocks for sequential consistency.

Another option is to have a centralized processor that is a sequencer.


Each update request is sent to the sequencer, which:
• Assigns the request a unique sequence number
• Forwards the update request to each replica

Operations are carried out in the order of their sequence numbers.
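A small sketch of the sequencer approach, under simplifying assumptions: the classes `Sequencer` and `SeqReplica` are hypothetical, everything runs in one process, and message reordering by the network is simulated by delivering the stamped requests in different orders.

```python
import itertools


class Sequencer:
    """Central process that hands out unique, increasing sequence numbers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, op):
        return (next(self._counter), op)


class SeqReplica:
    """Replica that carries out operations strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 1
        self.held = {}    # seq -> operation that arrived too early
        self.log = []     # operations carried out, in order

    def deliver(self, seq, op):
        self.held[seq] = op
        while self.next_seq in self.held:
            self.log.append(self.held.pop(self.next_seq))
            self.next_seq += 1


seq = Sequencer()
m1 = seq.stamp("deposit 100")
m2 = seq.stamp("add 5% interest")

r1, r2 = SeqReplica(), SeqReplica()
r1.deliver(*m1)
r1.deliver(*m2)
r2.deliver(*m2)    # the network delivered the requests out of order here
r2.deliver(*m1)
print(r1.log == r2.log)   # True: both replicas apply the same order
```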


The use of a sequencer also does not solve the scalability problem. It may become a performance bottleneck. What if it goes down?

A combination of Lamport timestamps and sequencers may be necessary.

The approach is summarized as follows:
• Each process has a unique identifier, pi, and keeps a sent-message counter ci. The process identifier and message counter uniquely identify a message.
• Active processes (sequencers) keep an extra counter, ti. This is called the ticket number. A ticket is a triplet (pi, ti, (pj, cj)).
• All other processes are passive.


Approach Summary (cont)
• Passive processes (non-sequencers) send their messages to their sequencer.
• Lamport’s totally ordered multicast algorithm is used among the sequencers to determine the order of update operations.
• When an operation is allowed, each sequencer sends the ticket to its associated passive processes. It is assumed that the passive processes receive these tickets in the order sent.


Approach Summary (cont)
• If a sequencer terminates abnormally, then one of the passive processes associated with it can become the new sequencer.
• An election algorithm may be used to choose the new sequencer.


Let’s say that we have 6 processes: p1,p2,p3,p4,p5,p6

Assume that p1,p2 are sequencers; p3,p4 are associated with p1 and p5,p6 are associated with p2

Let’s say that p3 sends a message which is identified by (p3, 1).

p1 generates a ticket as follows: (p1, 1, (p3, 1)). The second element is the ticket number, which is generated using the Lamport clock algorithm.


Let’s say that p5 sends a message which is identified by (p5, 1).

p2 generates a ticket as follows: (p2, 1, (p5, 1)).

Which update gets done first? p1 and p2 apply Lamport’s algorithm for totally ordered multicast.

When an update operation is allowed to proceed, the sequencers send messages to their associated processes.
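The two tickets in this example carry the same ticket number, so something has to break the tie. The sketch below assumes the usual Lamport convention of ordering by (timestamp, process id); the slides do not state the tie-breaking rule, and the `Ticket` type and `order_key` function are hypothetical. It shows only the final ordering step, not the acknowledgement exchange of the full totally ordered multicast algorithm.

```python
from typing import NamedTuple, Tuple


class Ticket(NamedTuple):
    sequencer: int             # pi, the sequencer that issued the ticket
    number: int                # ti, the Lamport-clock ticket number
    message: Tuple[int, int]   # (pj, cj), the message being ordered


def order_key(ticket):
    # Order by ticket number; break ties with the sequencer's identifier.
    return (ticket.number, ticket.sequencer)


tickets = [
    Ticket(sequencer=2, number=1, message=(5, 1)),   # (p2, 1, (p5, 1))
    Ticket(sequencer=1, number=1, message=(3, 1)),   # (p1, 1, (p3, 1))
]
ordered = sorted(tickets, key=order_key)
print([t.message for t in ordered])   # [(3, 1), (5, 1)]
```

Under this assumed rule the update from p3 is carried out before the update from p5 at every replica.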

Data-Centric Consistency Models

The consistency models just discussed are called data-centric consistency models.

Assumptions:
• Concurrent processes may be simultaneously updating the data.
• Updates need to be propagated quickly.

Eventual Consistency

In the banking example an account can have many updates from different sources, e.g., a person at an ATM, the bank adding interest.
• Updates should be “immediate”.

In many applications, only one or a few processes perform updates.

Example: DNS
• The DNS name space is divided into domains.
• Each domain has its own naming authority.
• Only that authority is allowed to update its part of the name space, e.g., change the IP address associated with a host name.
• This implies that there are no write-write conflicts.
• Does the update have to be done immediately? No. An update can be propagated in a lazy fashion, i.e., it is often acceptable to propagate an update only after some time has passed.

Eventual Consistency Example: WWW

• Web pages are updated by a single authority.
• Web pages are cached by browsers for efficiency.
• The cached page that is returned to the requesting client may be an older version compared to the one available at the actual web server.

This inconsistency is usually acceptable.

Some applications can tolerate relatively high inconsistency.

Eventual consistency requires only that updates are guaranteed to propagate to all replicas.
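Lazy propagation with a single writer can be illustrated with a toy example. This is only a sketch: `DnsReplica`, `apply` and `sync_from` are hypothetical names, the version number is assumed to be assigned by the single authority, and the host name and addresses are reserved documentation values. Between synchronisations the secondary serves a stale answer, but it converges once the update reaches it.

```python
class DnsReplica:
    def __init__(self):
        self.records = {}   # name -> (version, address)

    def apply(self, name, version, address):
        # Keep the record with the highest version; with one writer there
        # are no write-write conflicts to resolve.
        current = self.records.get(name)
        if current is None or version > current[0]:
            self.records[name] = (version, address)

    def sync_from(self, other):
        """Lazy propagation: pull whatever the other replica knows."""
        for name, (version, address) in other.records.items():
            self.apply(name, version, address)


primary, secondary = DnsReplica(), DnsReplica()
primary.apply("www.example.org", 1, "192.0.2.10")
secondary.sync_from(primary)

primary.apply("www.example.org", 2, "192.0.2.20")   # the authority changes the address
stale = secondary.records["www.example.org"][1]     # still the old address
secondary.sync_from(primary)                        # some time later
fresh = secondary.records["www.example.org"][1]
print(stale, fresh)   # 192.0.2.10 192.0.2.20
```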

Eventual Consistency

The principle of a mobile user accessing different replicas of a distributed database.

Eventual Consistency

The mobile user accesses the database by connecting to one of the replicas in a transparent way.

The application running on the user’s portable computer is (ideally) unaware of which replica it is actually operating on.

Assume the user performs several update operations and then disconnects again.

Later the user accesses the database again, possibly after moving to a different location or by using a different access device. The user may be connected to a different replica.

What if the updates have not propagated? Could be confusing to the user.

Client-Centric Consistency Models

Often there are some constraints placed on eventual consistency.

These constraints help define client-centric consistency models.


Monotonic reads: If a process reads a value of data item x, any subsequent read of x by the same process will return the same value or a later value.

Example
• Consider a distributed e-mail database.
• In such a database, each user’s mailbox may be distributed and replicated across multiple machines.
• Mail can be inserted in a mailbox at any location.
• Updates are propagated in a lazy (i.e., on demand) fashion.
• Assume that reads don’t change the mailbox.
• Suppose a user reads their e-mail in Vancouver and then flies to Toronto and reads their e-mail.
• A monotonic read guarantees that the messages that were in the mailbox in Vancouver will also be in the mailbox in Toronto.


Monotonic writes: A write operation by a process on data item x is completed before any subsequent write by the same process on x.

Example: Updating a software library
• An update may consist of replacing one or more functions, resulting in a new version.
• Updates performed on a copy of the library should be able to assume that all preceding updates have been performed first.


Read-Your-Writes: A write operation by a process on data item x will always be seen by a successive read operation on x by the same process.

The absence of this consistency is seen in the following examples.

Example: Updating Web HTML pages
• Cached web pages are still read even though the web page has been updated.

Example: Password updates for a digital library
• The update may occur at one site, but not be immediately propagated to a site where the account/password is actually needed.


Write-Follows-Reads: A write operation by a process on data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read.

Implementing Client-Centric Models

Globally unique ID per write operation
• Assigned by the initiating server
• Global IDs can be generated locally.
• A server is required to log the write operation so that it can be replayed at another server.

For each client, we keep track of two sets of write identifiers:
• Read set: write IDs relevant to the client’s read operations
• Write set: IDs of writes performed by the client

Major performance issue: size of the read/write sets


Monotonic read:
• When a client issues a read, the server is given the client’s read set to check whether all the identified writes have taken place locally.
• If not, the server contacts others to ensure that it is brought up-to-date.
• After the read, the client’s read set is updated with the server’s “relevant” writes.

Monotonic write:
• When a client issues a write, the server is given the client’s write set to ensure that all specified writes have been applied (in order).
• The write operation’s ID is appended to the client’s write set.


Read-your-writes:
• Before serving a read request, the server fetches (from other servers) all writes in the client’s write set.

Writes-follow-reads:
• The server is brought up-to-date with the writes in the client’s read set.
• After the write, the new ID is added to the client’s write set, along with the IDs in the read set, as these have become “relevant” for the write just performed.
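The read-set / write-set bookkeeping can be sketched in a few lines. This is an illustration under loudly simplified assumptions, not a real protocol: `Client`, `Server`, `WRITE_LOG` and `_next_id` are hypothetical names; a single shared dictionary stands in for "contacting other servers"; write IDs come from one global counter rather than being generated locally; and every write applied at a server is treated as "relevant" to a read.

```python
import itertools

WRITE_LOG = {}                  # write ID -> (key, value); stands in for the servers' replayable logs
_next_id = itertools.count(1)   # simplification: one global counter for write IDs


class Client:
    def __init__(self):
        self.read_set = set()    # IDs of writes relevant to this client's reads
        self.write_set = set()   # IDs of writes performed by this client


class Server:
    def __init__(self):
        self.applied = set()     # IDs of writes applied at this server
        self.data = {}           # key -> (write ID, value)

    def _apply(self, wid):
        key, value = WRITE_LOG[wid]
        if key not in self.data or self.data[key][0] < wid:
            self.data[key] = (wid, value)      # never overwrite a newer value with an older one
        self.applied.add(wid)

    def _catch_up(self, wanted):
        # Replay, in ID order, any wanted writes this server has not yet applied.
        for wid in sorted(wanted - self.applied):
            self._apply(wid)

    def read(self, client, key):
        # Monotonic reads (read set) and read-your-writes (write set).
        self._catch_up(client.read_set | client.write_set)
        client.read_set |= self.applied
        entry = self.data.get(key)
        return entry[1] if entry else None

    def write(self, client, key, value):
        # Monotonic writes (write set) and writes-follow-reads (read set).
        self._catch_up(client.read_set | client.write_set)
        wid = next(_next_id)
        WRITE_LOG[wid] = (key, value)
        self._apply(wid)
        client.write_set |= client.read_set | {wid}
        return wid


vancouver, toronto = Server(), Server()
user = Client()
vancouver.write(user, "password", "new-secret")
seen = toronto.read(user, "password")             # Toronto catches up before answering
stranger_view = Server().read(Client(), "password")
print(seen, stranger_view)                        # new-secret None
```

A client with empty sets gets no such guarantee, which is why the last read returns nothing: the write has simply not propagated to that server yet. The sets only grow in this sketch, which reflects the performance issue noted above.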

Impact of Mobility

Mobility suggests that a user may be disconnected.
• Assume that a user of a mobile device has downloaded their calendar from their workstation.
• The user’s device is disconnected.
• The user makes changes to the calendar on the mobile device.
• A secretary makes changes to the calendar on the workstation.
• When the user reconnects, the calendar on the user’s device and the calendar on the user’s workstation should become the same.

Some schemes have the user’s device be the primary and the workstation be a backup.
• This suggests that the calendar on the user’s device is considered the most recent.

Other Important Implementation Issues

Important implementation issues include the following:
• Placement and nature of replicas
• Distributing updates

Replica Placement

Permanent
• A process/machine always has a replica.
• Example: Mirroring of a web site

Server-Initiated
• Processes that can dynamically host a replica on request of another server.

Client-Initiated
• Processes that can dynamically host a replica on request of a client.
• Example: Web caches

Server-Initiated Replicas

Consider a web server placed in Toronto.
• Under normal situations, the server can handle incoming requests easily.
• It is predicted that in a couple of days there will be a sudden burst of requests.
• It may be worthwhile to install a number of temporary replicas in the regions where the requests are coming from.


The ability to optimize the dynamic placement of replicas is of special interest to web hosting services.
• ISPs pay a web hosting company (sometimes called an access-centric content distribution network) to serve popular content from caches close to the ISPs’ subscribers.

This model assumes that storage is cheaper than bandwidth, and that customers will not hesitate to move to other ISPs if they perceive their current ISP to be slow.


Example Heuristic:
• Keep track of access counts per file.
• If the number of accesses drops below some threshold value D, the file can be dropped.
• If the number of accesses exceeds a threshold R, the file should be replicated.
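The heuristic amounts to a two-threshold decision per file. The sketch below assumes D is smaller than R and treats counts between the two thresholds as "keep as is"; the function name, file names and numbers are made up for illustration.

```python
def placement_decision(access_count, drop_threshold, replicate_threshold):
    """Decide what to do with a file's replica based on its access count."""
    if access_count < drop_threshold:         # below D: the replica is not worth keeping
        return "drop"
    if access_count > replicate_threshold:    # above R: popular enough to replicate further
        return "replicate"
    return "keep"


D, R = 10, 100
access_counts = {"a.html": 2, "b.html": 50, "c.html": 500}
decisions = {name: placement_decision(count, D, R)
             for name, count in access_counts.items()}
print(decisions)
# {'a.html': 'drop', 'b.html': 'keep', 'c.html': 'replicate'}
```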

Client-Initiated Replicas

• Created at the initiative of clients.
• Known as caches.
• In essence, a cache is a local storage facility that is used by a client to temporarily store a copy of the data it has just requested.

Client caches are used to improve access times to data.

Data is generally kept in a cache for a limited amount of time e.g., to prevent extremely stale data from being used or make room for other data.

Cache placement can be local to a client’s machine or in a location that is easily accessible by other machines in the client’s organization.
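Keeping data in a cache "for a limited amount of time" is often done with a time-to-live per entry. A minimal sketch follows; the class `TtlCache` is hypothetical, and the clock is injectable only so that the example can be run without waiting.

```python
import time


class TtlCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}    # key -> (value, expiry time)

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:     # too old: discard rather than serve stale data
            del self.entries[key]
            return None
        return value


now = [0.0]                                      # fake clock for the demonstration
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("/index.html", "<html>v1</html>")
hit = cache.get("/index.html")                   # still fresh
now[0] = 61.0
miss = cache.get("/index.html")                  # expired; the client must refetch
print(hit, miss)
```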

Update Propagation

Update operations are generally initiated by a client and subsequently forwarded to one of the copies.

There are a number of design issues to consider.

State or Operation?
• An important design issue concerns what is actually to be propagated.
• Three possibilities:
  • Notification of an update
  • New copy of the data
  • Copy of the operation
• Trade bandwidth for processing
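The three possibilities can be contrasted with a toy replica. The class `ReplicaCopy` and its handler names are hypothetical. Sending the new state ships the whole data item, while sending the operation ships only a small message and makes each replica redo the work, which is the bandwidth-for-processing trade.

```python
class ReplicaCopy:
    def __init__(self, items):
        self.items = list(items)
        self.valid = True

    def on_notification(self):
        # Notification only: mark the copy stale; it must be refetched before use.
        self.valid = False

    def on_new_state(self, items):
        # New copy of the data: overwrite the whole local copy.
        self.items = list(items)
        self.valid = True

    def on_operation(self, operation):
        # Copy of the operation: re-execute it locally.
        operation(self.items)


data = list(range(1000))
a, b, c = ReplicaCopy(data), ReplicaCopy(data), ReplicaCopy(data)

a.on_notification()                                # tiny message, copy becomes unusable
b.on_new_state(data + [1000])                      # 1001 items cross the network
c.on_operation(lambda items: items.append(1000))   # only "append 1000" crosses the network
print(a.valid, b.items == c.items)                 # False True
```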

Update Propagation

Push vs Pull
• Another design issue is whether updates are pulled or pushed.
• Push by server
  • Server must know the replicas
  • Client is immediately updated
• Pull by client
  • Client must poll, or delay the response when an item is requested

Update Propagation

Push vs. Pull (cont)

Leases
• We can dynamically switch between pulling and pushing using leases. A lease is a contract in which the server promises to push updates to the client until the lease expires.
• Age-based leases: An object that hasn’t changed for a long time is unlikely to change in the near future, so it is given a long-lasting lease.
• Renewal-frequency-based leases: The more often a client requests a specific object, the longer the expiration time for that client (for that object) will be.
• State-based leases: The more loaded a server is, the shorter the expiration times become.
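The basic lease mechanism can be sketched as follows. This is a simplified illustration with a fixed lease length; `LeaseServer`, `grant` and `update` are hypothetical names, and time is passed in explicitly. The age-based, renewal-frequency-based and state-based variants would differ only in how the lease length is chosen at grant time.

```python
class LeaseServer:
    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.leases = {}     # client -> time at which its lease expires
        self.value = None

    def grant(self, client, now):
        self.leases[client] = now + self.lease_seconds

    def update(self, value, now):
        """Store the new value and return the clients it must be pushed to."""
        self.value = value
        return [client for client, expiry in self.leases.items() if expiry > now]


server = LeaseServer(lease_seconds=30)
server.grant("c1", now=0)     # lease valid until t = 30
server.grant("c2", now=20)    # lease valid until t = 50

pushed_at_25 = server.update("v1", now=25)   # both leases are still valid
pushed_at_40 = server.update("v2", now=40)   # c1's lease has expired; c1 must pull
print(pushed_at_25, pushed_at_40)            # ['c1', 'c2'] ['c2']
```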

Consistency Requirements in Applications

We have looked at several consistency models and possible implementations.

There are many more out there that are a variation of the models described.

It is important to understand the consistency requirements of the application domain.

Let’s look at some Internet applications.

Consistency Requirements for Applications

Bulletin board
• Replicated message posting service
• As discussed earlier, causal order is needed. Some bulletin boards may also want total order.
• There may be a requirement on how fast these updates should be.

KaZaa
• The order of updates doesn’t matter since downloading a file is a commutative operation, i.e., it doesn’t matter if song a is downloaded before song b or if song b is downloaded before song a.
• Some would say that what is important is that eventually all sites could have the same songs.
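Commutativity is easy to see if each site's collection is modelled as a set, which is an assumption made here for illustration (`apply_downloads` is a hypothetical helper). Two sites that receive the same downloads in different orders end up with identical collections, so no ordering protocol is needed.

```python
def apply_downloads(downloads):
    """Apply a sequence of downloads to an initially empty library."""
    library = set()
    for song in downloads:
        library.add(song)     # adding to a set is commutative
    return library


site_1 = apply_downloads(["song a", "song b"])
site_2 = apply_downloads(["song b", "song a"])
print(site_1 == site_2)   # True
```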


Chat Service
• Chat messages require causal order for discussions to make sense.

Games
• Players’ moves in a game must be delivered in the same order to all participants for fairness.

In both these cases, timeliness is important.
• A centralized solution results in a performance bottleneck.
• Games sometimes guess at moves or the position of objects on the game board, e.g., instead of sending and receiving messages for the position of an object, the software predicts what the positions would be.
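One common way to predict positions is linear extrapolation from the last known position and velocity, often called dead reckoning. The slides do not name a specific technique, so this is an assumed example, and `predict_position` is a hypothetical helper.

```python
def predict_position(last_position, velocity, elapsed):
    """Extrapolate an object's position assuming constant velocity."""
    x, y = last_position
    vx, vy = velocity
    return (x + vx * elapsed, y + vy * elapsed)


# Last update received: object at (10, 5), moving at (2, -1) units per second.
predicted = predict_position((10.0, 5.0), (2.0, -1.0), elapsed=0.5)
print(predicted)   # (11.0, 4.5)
```

Each participant can draw the predicted position immediately and correct it when the next real update arrives, which reduces the number of messages that need to be exchanged.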


Airline reservation
• This is representative of replicated e-commerce services that accept inquiries (searches) and purchase orders on a catalog.
• A measurement of consistency is used: the percentage of requests that access inconsistent results.
• Example: A user may observe an available seat when in fact the seat has been booked at another replica.
• Isn’t this handled by using one of the approaches to providing total order?
• Yes, but if a small violation of consistency is tolerated we can achieve better performance.


Airline reservation (cont)
• Consistency requirements change dynamically.
• Example: The cost of a transaction that must be rolled back is fairly small when a flight is empty but grows as the flight fills.
  • Why? When the flight is nearly empty, one can likely find an alternate seat on the same flight.
  • Requests when the flight is close to full may require a replica to be more aggressive in enforcing sequential consistency.