TRANSCRIPT
Consistency and Replication Distributed Systems IT332
1
Overview
Introduction
Consistency Models
Data‐centric
Client‐centric
Replica management
Consistency protocols
2
Why Replication?
Replication is the process of maintaining the data at multiple computers
Replication serves two main purposes:
Enhancing reliability
If one replica fails (becomes unavailable or crashes), another replica allows remote sites to continue working.
Protects against corrupted data.
Improving performance
Replicating services reduces the load on individual servers.
Placing replicas close to clients reduces communication delay.
This directly supports the distributed systems goal of enhanced scalability.
3
Why Consistency?
Key issue: if there are many replicas of the same data, how do we keep all of them up-to-date? In a DS with replicated data, keeping the replicas consistent is one of the main problems.
Consistency can be achieved in a number of ways. We will study a number of consistency models, as well as protocols for implementing the models.
An example: In an e-commerce application, the bank database has been replicated across two servers
Maintaining consistency of replicated data is a challenge
4
[Figure: Replicated database. Both replicas start with Bal=1000. Event 1 = Add $1000; Event 2 = Add interest of 5%. The replica that applies Event 1 first ends with Bal = 2100 (1000 → 2000 → 2100), while the replica that applies Event 2 first ends with Bal = 2050 (1000 → 1050 → 2050), so the replicas diverge.]
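The divergence can be reproduced in a few lines of Python (a minimal sketch; the two functions stand in for the slide's Event 1 and Event 2):

```python
def add_1000(bal):
    # Event 1: deposit $1000
    return bal + 1000

def add_interest(bal):
    # Event 2: add 5% interest
    return bal * 1.05

# Replica A applies Event 1 first, then Event 2.
replica_a = add_interest(add_1000(1000))   # 1000 -> 2000 -> 2100
# Replica B applies Event 2 first, then Event 1.
replica_b = add_1000(add_interest(1000))   # 1000 -> 1050 -> 2050

print(replica_a, replica_b)  # the replicas diverge: 2100.0 vs 2050.0
```

Because the two events do not commute, any consistency scheme must make the replicas agree on a single order of application.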
Overview
Consistency Models
Data-Centric Consistency Models
Client-Centric Consistency Models
Replica Management
Consistency Protocols
5
Distributed Data‐Store
A data store is a service that stores data, e.g., databases, file systems, and web servers.
A data store consists of multiple servers each containing a copy of all the data items stored in the data store
A data-store can be read from or written to by any process in a DS.
A local copy of the data-store (replica) can support “fast reads”.
A client always connects to a single replica:
Reads are performed locally
Writes are performed locally first and then propagated to remote replicas
Process 1 Process 2 Process 3
Local Copy
Distributed data-store
6
Consistency Model
A data store can be inconsistent in two ways:
The data could be stale.
Operations are performed in different orders at different replicas.
A “consistency model” is a CONTRACT between a DS data-store and its processes.
If the processes agree to the rules, the data-store will perform properly and as advertised.
In order for a data store to be consistent all write‐write conflicting operations must be seen in an agreed upon order by the clients.
A consistency model defines which interleavings of operations are acceptable (admissible).
7
Data‐Centric Consistency Models
Apply to the whole data store: any client accessing the data store will see operations ordered according to the model:
Strict consistency
Sequential consistency
Causal consistency
FIFO consistency
8
Strict Consistency
Any read on a data item returns the result of the most recent write on that data item (regardless of where the write occurred).
Requires that all clients share a notion of an absolute global time.
Requires instant propagation of writes to all replicas.
9
Behavior of two processes, operating on the same data item: a) A strictly consistent data-store. b) A data-store that is not strictly consistent.
Impossible to implement in a distributed data store.
Sequential Consistency
A weaker consistency model, which represents a relaxation of the rules.
It is also much easier (possible) to implement.
All clients see all (write) operations performed in the same order:
Assumes all operations are executed in some sequential order
Program order is maintained (i.e., the order of writes as performed by a single process must be maintained)
All processes see the same ordering of operations
10
(a) A sequentially consistent data store- the “first” write occurred after the “second” on all replicas.
(b) A data store that is not sequentially consistent- it appears the writes have occurred in a non-sequential order, and this is NOT allowed.
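A simple way to see the program-order requirement is a checker that tests whether a proposed global interleaving keeps each process's own operations in issue order (a minimal sketch; the function name is illustrative and the labels follow the W(x)a notation used in the figures):

```python
def respects_program_order(global_seq, per_process_ops):
    """Check that a global interleaving keeps each process's own
    operations in the order that process issued them -- the core
    requirement of sequential consistency."""
    for ops in per_process_ops:
        positions = [global_seq.index(op) for op in ops]
        if positions != sorted(positions):
            return False
    return True

# P1 writes a then b; P2 writes c.
p1 = ["W(x)a", "W(x)b"]
p2 = ["W(x)c"]

# Interleaving c between a and b is fine; all processes may agree on it.
print(respects_program_order(["W(x)a", "W(x)c", "W(x)b"], [p1, p2]))  # True
# Seeing b before a violates P1's program order.
print(respects_program_order(["W(x)b", "W(x)a", "W(x)c"], [p1, p2]))  # False
```

Sequential consistency additionally requires that every process observe the *same* admissible interleaving, not just some admissible one.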
Sequential Consistency
With this consistency model, adjusting the protocol to favor reads over writes (or vice-versa) can have a devastating impact on performance (refer to the textbook for the gory details).
For this reason, other weaker consistency models have been proposed and developed.
Again, a relaxation of the rules allows for these weaker models to make sense.
11
Causal Consistency
Weaker than sequential consistency
Two operations are causally related if:
A read by a client is followed by a write by that same client, or
A write of a data item is followed by a read of that data item by any client.
Writes that are potentially causally related must be seen by all processes in the same order.
Concurrent writes may be seen in different orders on different machines (i.e., by different processes), as long as each process's program order is respected.
12
(a) Not Permitted (b) Permitted
Causal Consistency
a) Violation of causal-consistency – P2’s write is related to P1’s write due to the read on ‘x’ giving ‘a’ (all processes must see them in the same order).
b) A causally-consistent data-store: the read has been removed, so the 2 writes are now concurrent. The reads by P3 and P4 are now OK.
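Vector clocks are one common mechanism for detecting the "potentially causally related" relation this model depends on (a hedged sketch, not part of the slides; the function name is illustrative):

```python
def happens_before(vc_a, vc_b):
    """True if the event with vector clock vc_a causally precedes the
    event with vc_b: every component <=, and at least one strictly <."""
    return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b

w1 = [1, 0]   # P1's write W(x)a
w2 = [1, 1]   # P2's write W(x)b, issued after P2 read 'a'
print(happens_before(w1, w2))   # True -> all processes must see w1 before w2

w3 = [0, 1]   # a write not preceded by any read of w1
# Neither precedes the other: the writes are concurrent and may be
# seen in different orders by different processes.
print(happens_before(w1, w3), happens_before(w3, w1))
```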
13
Exercise: Is it sequentially and causally consistent?
14
FIFO Consistency
This is also called “PRAM Consistency” – Pipelined RAM.
Program order must be respected
Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
The attractive characteristic of FIFO consistency is that it is easy to implement. There are no guarantees about the order in which different processes see writes, except that two or more writes from a single process must be seen in order (even if they are causally related).
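The per-writer ordering can be enforced with nothing more than per-process sequence numbers, which is why FIFO consistency is cheap to implement (a minimal sketch; the class and method names are illustrative):

```python
from collections import defaultdict

class FifoReplica:
    """Applies each writer's updates only in issue order: a sketch of
    FIFO (PRAM) consistency via per-writer sequence numbers."""
    def __init__(self):
        self.next_seq = defaultdict(int)   # writer -> next expected seq no
        self.pending = defaultdict(dict)   # writer -> {seq: update}
        self.log = []                      # updates applied, in order

    def deliver(self, writer, seq, update):
        self.pending[writer][seq] = update
        # Apply every update from this writer that is now in order.
        while self.next_seq[writer] in self.pending[writer]:
            self.log.append(self.pending[writer].pop(self.next_seq[writer]))
            self.next_seq[writer] += 1

r = FifoReplica()
r.deliver("P1", 1, "W(x)b")   # arrives early: held back
r.deliver("P1", 0, "W(x)a")   # fills the gap: both applied, in order
print(r.log)                  # ['W(x)a', 'W(x)b']
```

Note that updates from different writers are applied in whatever order they happen to complete, exactly as FIFO consistency allows.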
15
FIFO Consistency
a) A valid sequence of FIFO consistency events.
b) Not FIFO consistent
16
Summary of Data‐Centric Consistency Models
Consistency: Description
Strict: Absolute time ordering of all shared accesses matters.
Sequential: All processes see all shared accesses in the same order. Accesses are not ordered in time.
Causal: All processes see causally-related shared accesses in the same order.
FIFO: All processes see writes from a single process in the order they were issued. Writes from different processes may not always be seen in that order.
17
The consistency models above do not use synchronization operations. There are other models that do:
They require additional programming constructs, but allow programmers to treat the data store as if it were sequentially consistent, when in fact it is not. They should also offer better performance.
Client-centric Consistency Models
• The previously studied consistency models concern themselves with maintaining a consistent (globally accessible) data-store in the presence of concurrent read/write operations
• Another class of distributed data-store is that which is characterized by the lack of simultaneous updates. Here, the emphasis is more on maintaining a consistent view of things for the individual client process that is currently operating on the data-store.
18
Client-centric Consistency Models
Many systems have one or a few updaters and many readers: no write-write conflicts
How fast should updates (writes) be made available to read-only processes?
Examples: Think of most database systems: mainly read.
DNS: single naming authority per domain
Only the naming authority is allowed to update its part of the name space: write-write conflicts do not occur.
Web:
Web pages are updated by a single authority such as a webmaster or owner of the page,
As with DNS, except that heavy use of client-side caching is present: even the return of stale pages is acceptable to most users.
These systems all exhibit a high degree of acceptable inconsistency … with the replicas gradually becoming consistent over time.
19
Eventual Consistency
Eventual consistency: if no updates take place for a long time, all replicas gradually converge toward identical copies. The only requirement is that all replicas will eventually be the same.
All updates must be guaranteed to propagate to all replicas … eventually!
This works well if every client always updates the same replica.
Things are a little difficult if the clients are mobile.
Benefits:
Cheap to implement
Things work fine as long as clients always access the same replica.
What if they don’t: introduce client‐centric consistency
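Convergence is often implemented by periodically merging replica state, for example with a last-writer-wins rule (a simplified sketch, not part of the slides; real systems use more careful conflict resolution):

```python
def merge(replica_a, replica_b):
    """One anti-entropy step: last-writer-wins merge of two replicas.
    Each replica maps key -> (timestamp, value); the newer timestamp
    wins, so repeated pairwise merges drive all replicas to agreement."""
    merged = dict(replica_a)
    for key, (ts, val) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, val)
    return merged

a = {"x": (1, "old"), "y": (5, "yes")}
b = {"x": (3, "new")}
converged = merge(a, b)
print(converged["x"])  # (3, 'new') -- both replicas agree once merged
```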
20
Eventual Consistency: Mobile Problems
21
The principle of a mobile user accessing different replicas of a distributed database.
When the system can guarantee that a single client sees accesses to the data-store in a consistent way, we then say that “client-centric consistency” holds.
Client-centric Consistency Models
Client‐centric consistency provides guarantees for a single client concerning the consistency of accesses to a data store by that client
No guarantees are given concerning concurrent accesses by different clients
Four client‐centric consistency models
Monotonic reads
Monotonic writes
Read your writes
Writes follow reads
22
Monotonic Reads
If a client reads the value of a data item x, any successive read operation on x by that client will always return that same value or a more recent value
Some notations
xi denotes the version of data item x at replica i
Version xi is the result of a series of write operations at replica i that took place since initialization; this set is denoted as WS(xi)
WS(xi; xj) denotes that operations in WS(xi) have been performed at replica j
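Using the versions above, a client can enforce monotonic reads itself by remembering the highest version of each item it has read (a minimal client-side sketch; the class name and the replica representation are illustrative):

```python
class MonotonicReadClient:
    """Refuses a read from a replica whose version of an item is older
    than one this client has already seen; a real system would instead
    redirect the read or wait for the replica to catch up."""
    def __init__(self):
        self.last_seen = {}   # item -> highest version read so far

    def read(self, replica, item):
        version, value = replica[item]   # replica maps item -> (version, value)
        if version < self.last_seen.get(item, -1):
            raise RuntimeError("replica too stale for monotonic reads")
        self.last_seen[item] = version
        return value

replica1 = {"x": (2, "b")}   # has applied two writes to x
replica2 = {"x": (1, "a")}   # lags behind

c = MonotonicReadClient()
print(c.read(replica1, "x"))  # 'b'; the client now remembers version 2
# Reading x from the stale replica would now violate monotonic reads:
try:
    c.read(replica2, "x")
except RuntimeError as e:
    print(e)
```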
23
Monotonic Reads
24
a) A monotonic-read consistent data store.
b) A data store that does not provide monotonic reads.
Example: The read operations performed by a single process P at two different local copies of the same data store.
Monotonic Writes
Example: The write operations performed by a single process P at two different local copies of the same data store
A write operation by a client on a data item x is completed before any successive write operation on x by the same client.
This is essentially a client-centric version of FIFO consistency (writes from the same client are sequentially ordered).
25
a) A monotonic-write consistent data store.
b) A data store that does not provide monotonic-write consistency.
Read Your Writes
The effect of a write operation by a client on data item x will always be seen by a successive read operation on x by the same client
26
a) A data store that provides read-your-writes consistency.
b) A data store that does not.
Writes Follow Reads
A write operation by a client on a data item x, following a previous read operation on x by the same client, is guaranteed to take place on the same or a more recent value of x than the one that was read.
27
a) A writes-follow-reads consistent data store.
b) A data store that does not provide writes-follow-reads consistency.
Overview
Consistency Models
Replica Management
Consistency Protocols
28
Replica Management
Replica management describes where, when, and by whom replicas should be placed.
We will study two problems under replica management
Replica Placement
Update Propagation
29
Replica Placement
Permanent replicas
are created by the data store owner and function as permanent storage for the data.
tend to be small in number (often a single server), organized as COWs (Clusters of Workstations) or mirrored systems (a cluster or group of mirrors)
Server‐initiated replicas
are replicas created to enhance the performance of the system, at the initiative of the owner of the data store.
placed on servers maintained by others.
30
Replica Placement
Placed close to large concentrations of clients.
Typically used by web hosting companies to geographically locate replicas close to where they are needed most. (Often referred to as “push caches”).
Client-initiated replicas
are created as a result of client requests: temporary copies created by clients to improve their access to the data (client caches)
Examples: Web browser caches and proxy caches
Works well assuming, of course, that the cached data does not go stale too soon.
31
Dynamic Replication
The decisions about where to place replicas and when to create new ones/destroy existing ones/migrate existing ones are made automatically by the system
Requirements
A network of servers willing to host replicas
Collection of usage pattern data
Migration of replicas to and from other servers
32
Dynamic Replication Example
Each server keeps track of access counts per file, aggregated by the server closest to the requesting clients.
If P is the server closest to clients C1 and C2, all access requests for F at Q from C1 and C2 are registered at Q as a single access count cntQ(P, F).
If the number of accesses for F at Q drops below the deletion threshold D: remove F from Q.
If the number of accesses for F at Q exceeds the replication threshold R: replicate F on another server.
If D ≤ number of accesses for F at Q ≤ R: migrate F to P if cntQ(P, F) exceeds half of the total requests for F at Q.
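The three threshold rules can be sketched directly (a hypothetical function; D and R are the slide's deletion and replication thresholds, and the argument names are illustrative):

```python
def placement_decision(count, count_from_p, total, D, R):
    """Decide what server Q should do with file F, given its total
    access count, the count attributed to nearest-server P, the total
    requests seen at Q, and the thresholds D (delete) and R (replicate)."""
    if count < D:
        return "delete"                 # too few accesses: drop the replica
    if count > R:
        return "replicate"              # hot file: add a replica elsewhere
    if count_from_p > total / 2:
        return "migrate to P"           # most traffic routes through P
    return "keep"

print(placement_decision(count=2,  count_from_p=1,  total=2,  D=5, R=50))
print(placement_decision(count=80, count_from_p=10, total=80, D=5, R=50))
print(placement_decision(count=20, count_from_p=15, total=20, D=5, R=50))
```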
33
Update Propagation
When a client initiates an update to a distributed data-store, what gets propagated?
1. Propagate notification (invalidation protocol) of the update to the other replicas :
indicates that the replica’s data is no longer up-to-date.
best when the read-to-write ratio is small (there are many writes relative to reads).
Uses little network bandwidth.
2. Transfer the data from one replica to another:
Works well when there are many reads, i.e., when the read-to-write ratio is high.
3. Propagate the update to the other replicas :
this is “active replication”, and shifts the workload to each of the replicas upon an “initial write”.
Updates can be propagated at minimal bandwidth costs
More processing power required by each replica
34
Pull vs. Push
Another design issue relates to whether or not the updates are pushed or pulled?
Push-based (server-based) approach: server-initiated; updates are pushed automatically to all replicas when they occur, without the client requesting them.
Useful when a high degree of freshness and consistency is required, and when the read‐to‐update ratio is high.
Often used between permanent and server-initiated replicas.
Pull-based (client-based) approach: no request, no update! Used by client caches (e.g., browsers); updates are requested by the client from the server.
Updates are pulled from replicas when they are needed.
Efficient when read‐to‐update ratio is low.
35
Pull vs. Push
36
A comparison between push‐based and pull‐based approaches in the case of multiple‐ client, single‐server systems
Overview
Consistency Models
Replica Management
Consistency Protocols
37
Consistency Protocols
Two techniques to implement sequential consistency:
Primary‐based protocols: each data item has a primary replica on which all writes are performed
Remote‐write: writes are possibly executed on a remote replica
Local‐write: writes are always executed on the local replica
Replicated‐write protocols: writes are performed on multiple replicas simultaneously
Active replication
38
Remote-Write Protocol
• With this protocol, all writes are performed at a single (remote) server.
• This model is typically associated with traditional client/server systems.
A variation is known as primary‐backup protocol:
Allow local reads, send writes to primary
Block on write until all backups have updated their local copy
Nonblocking approach: the primary returns an ACK as soon as it has updated its local copy, which speeds up writes.
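The blocking primary-backup variant can be sketched as follows (illustrative classes, not part of the slides; returning "ACK" models the point at which the client unblocks):

```python
class Backup:
    def __init__(self):
        self.store = {}

    def read(self, key):                 # local reads are allowed
        return self.store[key]

class Primary:
    """Blocking primary-backup: the primary applies a write, forwards
    it to every backup in the same order, and acknowledges only after
    all backups have applied it."""
    def __init__(self, backups):
        self.store = {}
        self.backups = backups

    def write(self, key, value):
        self.store[key] = value
        for b in self.backups:           # same order at every backup
            b.store[key] = value
        return "ACK"                     # client unblocks only here

b1, b2 = Backup(), Backup()
primary = Primary([b1, b2])
primary.write("x", 5)
print(b1.read("x"), b2.read("x"))  # 5 5 -- every backup agrees
```

Because all writes funnel through one primary, every backup applies them in an identical order, which is what makes sequential consistency easy here.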
39
Remote-Write Protocol
40
Remote-Write Protocol: A variation
41
Writes are still centralised, but reads are now distributed. The primary coordinates writes to each of the backups.
Remote-Write Protocol
42
[Figure: Client 1 sends x += 5 to the primary server. Replicas R1, R2, R3 each start with xi = 0; the primary applies the write and propagates it to every backup, after which x1 = x2 = x3 = 5.]
Bad: performance! All of those writes can take a long time when using a blocking write protocol, and using a non-blocking write protocol to handle the updates instead can lead to fault-tolerance problems (our next topic).
Good: since the primary is in control, all writes can be sent to each backup replica IN THE SAME ORDER, making it easy to implement sequential consistency.
Local‐Write Protocol
In this protocol, a single copy of the data item is still maintained.
Upon a write, the data item gets transferred to the replica that is writing.
Multiple successive writes can be carried out locally
That is, the status of primary for a data item is transferable.
This is also called a “fully migrating approach”.
Good for mobile clients operating in disconnected mode
The client becomes the primary before disconnecting from network, can perform updates locally
The client updates all the backups when reconnecting to the network
43
Local‐Write Protocol
44
Primary-based local-write protocol in which a single copy is migrated between processes (prior to the read/write).
Local‐Write Protocol
The big question to be answered by any process about to read from or write to the data item is:
“Where is the data item right now?”
It is possible to use some of the dynamic naming technologies studied earlier in this course, but scaling quickly becomes an issue.
Processes can spend more time actually locating a data item than using it!
45
Local‐Write Protocol: A variation
46
Primary-backup protocol in which the primary migrates to the process wanting to perform an update, then updates the backups. Consequently, reads are much more efficient.
Active Replication
A type of replicated-write (distributed-write) protocol, where writes can be carried out at any replica.
Writes are sent to all replicas, reads are performed locally.
Writes must be carried out in the same order everywhere. This requires atomic multicast or a centralized sequencer.
Centralized sequencer approach: Each write is forwarded to the sequencer
Sequencer assigns a unique Seq. No. to the write and forwards the write to all replicas
Each replica carries out the writes in the order of Seq. No.
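The centralized-sequencer approach can be sketched as follows (illustrative classes; for brevity the replicas assume in-order delivery, whereas real replicas would buffer out-of-order writes):

```python
import itertools

class Sequencer:
    """Stamps each forwarded write with a global sequence number, so
    every replica applies writes in one agreed order."""
    def __init__(self, start=10):        # the figure's numbering starts at 10
        self.counter = itertools.count(start)

    def stamp(self, write):
        return (next(self.counter), write)

class Replica:
    def __init__(self, start=10):
        self.applied = []
        self.next_no = start

    def deliver(self, seq, write):
        assert seq == self.next_no       # simplification: in-order delivery
        self.applied.append(write)
        self.next_no += 1

seq = Sequencer()
replicas = [Replica(), Replica(), Replica()]
for write in ["x+=5", "x-=2"]:           # Client 1's and Client 2's writes
    stamped = seq.stamp(write)           # forwarded to the sequencer first
    for r in replicas:
        r.deliver(*stamped)              # then multicast to every replica

print(replicas[0].applied)  # ['x+=5', 'x-=2'] -- identical at every replica
```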
47
[Figure: Client 1 issues x += 5 and Client 2 issues x -= 2 at different replicas. The sequencer Seq assigns sequence numbers 10 and 11 to the two writes and forwards them to replicas R1, R2, and R3, which apply them in sequence-number order.]
Next Chapter
Fault Tolerance
How to detect and deal with failures in Distributed Systems?
Questions?
48