Concurrency Control & Caching Consistency Issues and Survey
Dingshan He
November 18, 2002.
Outline
• Infrastructure assumptions
• Concurrency control & caching consistency issues in the infrastructure
• Survey concurrency control & caching consistency solutions in existing systems
  – Storage Tank
  – Oceanstore
  – Coda
• Discussion
Infrastructure Assumptions
• Entities
  – Clients
  – OSD’s
  – Regional Managers
• Mobility
  – Clients could have high mobility
  – OSD’s have moderate mobility
  – Regional Managers are relatively static
Infrastructure Assumptions (cont.)
• Connectivity
  – Disconnection is possible at any place
  – Clients could have weak connectivity (low bandwidth/long latency)
• Any of the three kinds of entities can be dynamically created and inserted into the infrastructure
Client Behavior
• Caching information for performance as well as in expectation of disconnection
• High mobility
  – Transfer between regional managers
  – Changing of concurrency control & caching consistency information
• Weak connectivity
  – Reduce message volume
OSD Behavior
• Mobility
  – Transfer between regional managers
  – Handing over of the concurrency control & caching consistency management task
  – Redirecting requests to new regional managers
Regional Manager Behavior
• Support transferring of clients and OSD’s
• Efficiently perform handover
• Disconnection
  – Regional managers get partitioned
  – Maintain strong consistency within connected partitions
  – Maintain enough information for reintegration
Exploiting Object Features
• No single solution can satisfy all situations
• Each object should have its own requirements
• Our design should identify these requirements, abstract them into several levels, and apply a corresponding mechanism to each level (see the sketch below)
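As a rough illustration of this idea, here is a minimal sketch in Python; the level names and the mechanism strings are hypothetical and not taken from any of the surveyed systems:

from enum import Enum

class ConsistencyLevel(Enum):
    STRONG = 1      # lock-based, Storage Tank style
    OPTIMISTIC = 2  # log-and-reintegrate, Coda style
    EVENTUAL = 3    # push/pull secondary replicas, Oceanstore style

def mechanism_for(level: ConsistencyLevel) -> str:
    # Map an object's declared level to a concurrency control / caching
    # consistency mechanism; the mapping itself is purely illustrative.
    return {
        ConsistencyLevel.STRONG: "distributed locks with leases",
        ConsistencyLevel.OPTIMISTIC: "local updates plus replay log",
        ConsistencyLevel.EVENTUAL: "best-effort cache refresh",
    }[level]

print(mechanism_for(ConsistencyLevel.OPTIMISTIC))  # local updates plus replay log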
Survey of Several Existing Systems
• IBM Storage Tank
• Oceanstore
• Coda
Storage Tank with OSD
IBM Storage Tank Protocol
• A locking and data consistency model
• Allows the IBM Storage Tank distributed storage system to look and behave like a local file system
• Objective: provide strong data consistency between clients and servers in a distributed environment
Storage Tank Features
• Concurrency control (a lock sketch follows this slide)
  – Semi-preemptive session locks
  – Byte-range locks (mandatory and advisory)
  – Cache coherency data locks
• Sequential consistency
• Direct I/O for caching applications (e.g., databases)
• Publish consistency for web updates
• Aggressive caching
  – Write-back caching of data and metadata
  – Session state via semi-preemptive locks
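A minimal sketch of what a semi-preemptive session lock could look like, assuming the server may demand a lock back and that an idle holder complies (after flushing write-back data) while an actively working session may defer; class and method names are hypothetical, not the actual Storage Tank protocol messages:

import threading

class SessionLock:
    # One lock per file session; "semi-preemptive" here means the server
    # can ask for it back, but an actively working holder is not forced out.
    def __init__(self):
        self._mu = threading.Lock()
        self.holder = None
        self.in_active_use = False

    def acquire(self, client_id):
        with self._mu:
            if self.holder is None:
                self.holder = client_id
                return True
            return False  # caller would wait or trigger a demand-back

    def demand_back(self):
        # Server-side preemption: succeeds only when the holder is idle.
        # A real client would flush its write-back data before releasing.
        with self._mu:
            if self.holder is not None and not self.in_active_use:
                self.holder = None
                return True
            return False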
Storage Tank Features (cont.)
• Data consistency across client failures
  – Leases for failure detection and coordinated recovery (sketched below)
    • Implicit leases
    • Opportunistic renewal
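A sketch of lease-based failure detection with opportunistic renewal, under the assumption that any message from a client implicitly extends its lease, so no dedicated renewal traffic is needed; the structure is illustrative, not the Storage Tank wire protocol:

import time

LEASE_SECONDS = 30.0  # assumed lease length

class LeaseTable:
    def __init__(self):
        self.expiry = {}  # client_id -> absolute expiry time

    def on_message(self, client_id):
        # Opportunistic renewal: every request piggybacks a lease extension.
        self.expiry[client_id] = time.monotonic() + LEASE_SECONDS

    def expired_clients(self):
        # Clients whose leases lapsed are presumed failed; the server can
        # then start coordinated recovery of their locks and cached state.
        now = time.monotonic()
        return [c for c, t in self.expiry.items() if t < now]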
Storage Tank Client Cache
• Data
• Metadata
• Locks
Comments on Storage Tank
• Designed to provide performance comparable to that of file systems built on bus-attached, high-performance storage
• Works in a data-center model
• Physically restricted to enterprise-wide data sharing
Oceanstore’s Update Model
• An update is a list of predicate-action pairs (see the sketch below)
• If some predicate evaluates to true, the update commits
• Each update is applied atomically
• Can perform many useful predicates and actions against encrypted data
  – Search over encrypted data
  – Delete and append using a position-dependent block cipher
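As a sketch of the update shape (simplified: no encryption, and assuming the first pair whose predicate holds fires its action, after which the update commits):

def apply_update(obj, pairs):
    # pairs is a list of (predicate, action) callables over object contents.
    # The update is applied atomically: either one action fires, or nothing.
    for predicate, action in pairs:
        if predicate(obj):
            return action(obj), True   # commit
    return obj, False                  # abort: no predicate matched

# Example: append only if the object still carries the expected version tag.
new_obj, committed = apply_update(
    b"v1:data",
    [(lambda o: o.startswith(b"v1:"), lambda o: o + b"+more")],
)
print(committed)  # True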
Oceanstore Consistency Solution
• Uses a two-tier architecture
  – Primary tier: uses distributed consistency
    • Replicas use a Byzantine agreement protocol
    • Replicas sign decisions using proactive signatures
  – Secondary tier: acts as a distributed read/write cache
    • Kept up-to-date via “push” or “pull”
• Supports connected and disconnected modes of operation
Oceanstore Update Serialization
• Clients optimistically timestamp updates with commit times
• Secondary replicas tentatively order updates by timestamp
• Primary tier picks the total order, guided by the timestamps, using a Byzantine agreement protocol (see the sketch below)
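A sketch of the two-stage ordering, assuming client timestamps only guide the tentative order while the primary tier's committed order is authoritative (the Byzantine agreement itself is not modeled):

import heapq

class SecondaryReplica:
    def __init__(self):
        self.tentative = []   # min-heap of (client_timestamp, update_id)
        self.committed = []   # total order decided by the primary tier

    def receive(self, client_timestamp, update_id):
        # Tentative ordering: optimistic client timestamps decide for now.
        heapq.heappush(self.tentative, (client_timestamp, update_id))

    def on_commit(self, ordered_ids):
        # The primary tier's decision replaces the tentative guess; updates
        # ordered differently than guessed would be rolled back and redone.
        self.committed.extend(ordered_ids)
        done = set(ordered_ids)
        self.tentative = [e for e in self.tentative if e[1] not in done]
        heapq.heapify(self.tentative)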
Comments on Oceanstore
• Infrastructure similar to our assumed one
• Does not separate metadata and data
Coda Volume Management
• Volume Storage Group (VSG) : set of servers with replicas of a volume
• Degree of replication and identities of the replication sites are specified when a volume is created
• The above info. is stored in a volume replication database present at every server
• Venus keeps track of Available VSG (AVSG) for every volume from which it has cached data
Coda Read/Write Strategy
• Client obtains data from one member of its AVSG, called the preferred server
• Other servers are contacted to verify that the preferred server has the latest copy
• When a file is closed after modification, it is transferred in parallel to all members of the AVSG (see the sketch below)
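A sketch of this strategy, assuming a Server stub with fetch/version_of/store operations (hypothetical names; real Coda reaches servers over parallel RPC):

from concurrent.futures import ThreadPoolExecutor

class Server:
    # Stub for an AVSG member; a real Coda server is reached via RPC.
    def fetch(self, path): ...
    def version_of(self, path): ...
    def store(self, path, data): ...

def read_file(avsg, path):
    preferred, others = avsg[0], avsg[1:]
    data, version = preferred.fetch(path)
    # Only version information is requested from the other members;
    # no bulk data moves unless the preferred server turns out stale.
    if any(s.version_of(path) > version for s in others):
        raise RuntimeError("preferred server is stale; choose another member")
    return data

def write_on_close(avsg, path, data):
    # On close after modification, ship the file to all AVSG members in parallel.
    with ThreadPoolExecutor(max_workers=len(avsg)) as pool:
        list(pool.map(lambda s: s.store(path, data), avsg))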
Coda’s Disconnected Operation
• Aim: to provide a file system with resilience to network failures
• Venus acts as a pseudo-server
• Updates have to be revalidated with respect to integrity and protection by the real servers
• Venus records sufficient information to replay update activity in a per-volume log called the replay log (sketched below)
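A sketch of what recording into the replay log could look like, with a hypothetical record layout (the storeid field anticipates the reintegration check on the next slide):

import time

class ReplayLog:
    # One log per volume; Venus appends every mutating operation it
    # performs while disconnected, with the replica version it observed.
    def __init__(self, volume):
        self.volume = volume
        self.records = []

    def record(self, op, path, observed_storeid, **args):
        self.records.append({
            "op": op,                     # e.g. "store", "mkdir", "rename"
            "path": path,
            "storeid": observed_storeid,  # version seen at update time
            "args": args,
            "time": time.time(),
        })

log = ReplayLog("/coda/usr/he")
log.record("store", "notes.txt", observed_storeid="client42:0007", length=1024)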
Coda’s Reintegration
• The replay log is shipped in parallel to the AVSG, and executed independently at each member
• Each replica of an object is tagged with a storeid
• The storeid of each object mentioned in the replay log is compared against the storeid of the server’s replica of the object (see the validation sketch below)
• Replay proceeds in four steps: 1) lock referenced objects, 2) validate and execute each operation, 3) transfer data, and 4) commit the transaction and release locks
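A sketch of the validation step, assuming an operation is safe to replay only when the server replica's storeid still equals the storeid the client observed (a mismatch means someone else updated the object during the disconnection); helper names are hypothetical, not Coda's actual server code:

def validate(record, server_storeids):
    # Compare the storeid recorded in the replay log against the server's
    # current storeid for the same object; equality means no lost update.
    return server_storeids.get(record["path"]) == record["storeid"]

def reintegrate(log_records, server_storeids):
    # Simplified single-pass replay; real Coda locks objects first,
    # transfers data, and commits as a transaction (steps 1-4 above).
    conflicts = []
    for rec in log_records:
        if validate(rec, server_storeids):
            server_storeids[rec["path"]] = rec["storeid"] + ":replayed"
        else:
            conflicts.append(rec)  # surfaced for manual repair in real Coda
    return conflicts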
Comments on Coda
• Designed for a specific setting, particularly a campus environment
• Optimistic replica control
  – Conflicting updates
  – Security of cached replicas