Mobile Data Access 1
Replication, Caching, Prefetching and Hoarding for Mobile Computing
Mobile Data Access 2
Definitions
Replication: To maintain multiple (consistent) copies of a data item.
Static replication: the number and location of copies are statically determined (at compile time or design time).
Dynamic replication: the number and location of copies are determined dynamically (at run time).
Caching: To maintain a temporary copy of the data in fast (local) memory. The copy is fetched when it is first accessed.
Prefetching: To obtain a temporary copy of the data before it is accessed (to hide access latency).
Hoarding: To preload a copy of a data object so that the mobile client can work while it is disconnected from the network (i.e., prefetching to tolerate disconnections).
Mobile Data Access 3
Data Access Model
On-Demand Broadcast Channel
Mobile Data Access 4
Motivation
Caching (prefetching/hoarding) at mobile clients is crucial to improve the performance of information access and database querying.
Issues:
Read-only data: currency guarantees
Read/write data: consistency in the presence of disconnected operations
Server load/scalability in the presence of numerous clients
Mobile Data Access 5
Mobile Database Querying: Requirements
Minimize query delay
Maximize number of queries answered per unit time (system throughput)
Handle client disconnection
Conserve wireless bandwidth and battery power
Minimize server load
Handle mobility
Mobile Data Access 6
Advantages of Caching in Mobile Environment
Helps reduce latency caused by narrow-bandwidth wireless links
Enables limited functionality in mobile hosts even in disconnected mode
Helps conserve battery power by reducing the number of uplink queries
Conserves bandwidth
Mobile Data Access 7
Problems in Maintaining Consistent Cache
Classic solutions do not work:
Mobile clients may be disconnected for a long duration => invalidations may be lost.
Upon reconnection, mobile clients will have to revalidate their cache (wastes energy and bandwidth).
Need new solutions
Mobile Data Access 8
Challenges to Efficient Caching Scheme
An efficient caching scheme should take into account:
Data access pattern
Data update rate
Communication/access cost
Mobility pattern of the clients
Connectivity characteristics
• Disconnection frequency
• Available bandwidth
Data currency requirements
Location-dependence of information
Mobile Data Access 9
General Issues in Designing Caching Schemes
Where to cache? How many levels of caching to use?
What to cache (when to cache a data item and for how long)?
How to invalidate cached items? Who is responsible for invalidations? What is the granularity at which the invalidations are done?
What data currency guarantees can the system provide to the user? What are the costs involved? How to charge the user?
What is the effect of the caching scheme on the query delay (response time) and the system throughput (query completion rate)?
Mobile Data Access 10
Classification of Cache Invalidation Schemes
Who is in charge of invalidations? Server or client (push or pull): callbacks or validation checks
Does the server maintain per-client state information? Stateless or stateful server
How does the server send invalidation reports? Synchronously or asynchronously
What kind of information is sent in the invalidation report? State-based or history-based
How is information organized in invalidation reports? Uncompressed or compressed
Mobile Data Access 11
Cache Maintenance Schemes
Broadcasting Invalidation Reports [Barbara Sigmod 94]
Disconnected Operation in CODA (Satyanarayanan et al.)
Hoarding (Prefetching)
AS (Asynchronous Stateful) Caching Scheme (Kahol et al., ICDCS 00)
Mobile Data Access 12
Broadcasting Invalidation Reports
Uses stateless servers and synchronous broadcasts [Barbara Sigmod 94]
Clients maintain local caches and use the information in invalidation reports to update their cache.
A server broadcasts an invalidation report every L time units; each report contains the ids of all data items that changed during the past w = kL time units.
A query is answered only after the client receives the next invalidation report.
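The report construction described above can be sketched as follows. This is a minimal illustration, not the scheme's actual implementation: the class name Server, the report layout (a dict with "timestamp" and "changed"), and the choice to carry update timestamps alongside the ids are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stateless server: it tracks updates, not client caches."""
    L: float                                         # broadcast period
    k: int                                           # window multiplier, w = k * L
    last_update: dict = field(default_factory=dict)  # item id -> time of last update

    def update(self, item_id, now):
        # Record that item_id changed at time `now`.
        self.last_update[item_id] = now

    def build_report(self, now):
        # Report every item updated in the window (now - w, now].
        w = self.k * self.L
        changed = {i: t for i, t in self.last_update.items() if now - w < t <= now}
        return {"timestamp": now, "changed": changed}


server = Server(L=10, k=3)
server.update("x", 5)
server.update("y", 27)
print(server.build_report(30))   # window (0, 30]: both x and y are reported
print(server.build_report(40))   # window (10, 40]: only y is reported
```

Because the server keeps no per-client state, the same report serves every client that happens to be listening at broadcast time.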
Mobile Data Access 13
Broadcasting IR: Variations
If a client is disconnected from the network and misses k consecutive invalidation reports then it has to discard its cache.
Two variations:
1. Timestamp Strategy (TS): invalidation reports contain ids of data items modified over a large window (k > 1).
2. Amnesic Terminal (AT): invalidation reports contain ids of only those data items which changed since the last broadcast (k=1).
TS is better when clients are “sleepers” and AT is better when clients are “workaholics”.
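A minimal sketch of the client side, assuming a report is a dict with a "timestamp" and a "changed" map from item id to update time (a hypothetical layout chosen for the example). Setting k = 1 approximates the AT case, where missing a single report forces the cache to be dropped.

```python
class ClientCache:
    """Hypothetical client cache driven by periodic invalidation reports."""

    def __init__(self, L, k):
        self.window = k * L        # w = kL, the span covered by each report
        self.items = {}            # item id -> (value, time the copy was cached)
        self.last_report = 0.0     # timestamp of the last report received

    def put(self, item_id, value, now):
        self.items[item_id] = (value, now)

    def on_report(self, report):
        ts = report["timestamp"]
        if ts - self.last_report > self.window:
            # The gap exceeds the window: some updates may never have been
            # reported to this client, so the whole cache must be discarded.
            self.items.clear()
        else:
            for item_id, updated_at in report["changed"].items():
                cached = self.items.get(item_id)
                if cached is not None and cached[1] < updated_at:
                    del self.items[item_id]   # cached copy is older than the update
        self.last_report = ts


cache = ClientCache(L=10, k=3)
cache.put("x", "vx", now=0)
cache.put("y", "vy", now=0)
cache.on_report({"timestamp": 10, "changed": {"x": 5}})   # x is invalidated
cache.on_report({"timestamp": 50, "changed": {}})         # gap of 40 > w = 30
```

A larger k lets a sleeping client survive longer disconnections at the price of longer reports, which is the trade-off behind the sleepers/workaholics remark.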
Mobile Data Access 14
Disconnected Operation in CODA
Goal: COnstant Data Availability
Mechanisms: server replication and disconnected operations.
Caching scheme (asynchronous, stateful):
Uses callbacks while a client is reachable from a server.
During disconnections permits access to possibly stale data.
Upon reconnection, the client does validity checks on each volume cached.
Uses hoarding to improve data availability
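The callback and revalidation steps can be illustrated with a small sketch. It is not CODA's code: VolumeServer, its integer version stamps, and the revalidate helper are hypothetical names, and real volumes hold files rather than a bare counter.

```python
class VolumeServer:
    """Hypothetical stateful server that remembers which clients hold callbacks."""

    def __init__(self):
        self.versions = {}     # volume -> version stamp
        self.callbacks = {}    # volume -> ids of clients holding a callback promise

    def register(self, client_id, volume):
        # The client caches the volume; the server promises to call it back on change.
        self.callbacks.setdefault(volume, set()).add(client_id)
        return self.versions.get(volume, 0)

    def write(self, volume):
        # Bump the version and break all callbacks on the volume.
        self.versions[volume] = self.versions.get(volume, 0) + 1
        return self.callbacks.pop(volume, set())   # clients to notify, if reachable


def revalidate(cached_versions, server):
    """On reconnection, return the cached volumes whose version stamp is stale."""
    return [v for v, ver in cached_versions.items()
            if server.versions.get(v, 0) != ver]


server = VolumeServer()
server.write("vol1")
cached = {"vol1": server.register("c1", "vol1"),
          "vol2": server.register("c1", "vol2")}
server.write("vol2")                 # happens while client c1 is disconnected
print(revalidate(cached, server))    # vol2 must be refetched after reconnection
```

The per-volume callback sets are the per-client state the server has to keep, and the loop in revalidate is the volume-by-volume check performed after each reconnection.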
Mobile Data Access 15
Drawbacks
Drawbacks of Barbara’s scheme:
Poor delay characteristics due to the waiting involved before answering a query.
Poor network utilization characteristics due to answering of queries in bursts.
Does not support arbitrary disconnection patterns.
Drawbacks of CODA caching scheme:
Server has to keep cache state of each client (affects scalability).
A client has to perform volume-by-volume validation check after each reconnection.
Mobile Data Access 16
AS Caching Scheme (Kahol et al.)
Maintains a Home Location Cache (HLC) at the home MSS of a mobile client.
An HLC contains the state of the cache at an MH.
Uses asynchronous transfer of invalidation reports.
Supports arbitrary disconnection durations by maintaining the timestamp of the last invalidation report destined for an MH at its HLC.
Mobile Data Access 17
An Example for AS Scheme
Each cache is associated with a cache timestamp which is the timestamp of the last invalidation report received.
A mobile client sends a probe message to its home MSS when it gets connected to determine whether it missed any invalidation reports while it was disconnected.
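The probe exchange can be sketched as follows. This is a simplified illustration under stated assumptions, not the published protocol: the classes HomeLocationCache and MobileHost, their method names, and the integer timestamps are all hypothetical.

```python
class HomeLocationCache:
    """Hypothetical per-MH state kept at the home MSS."""

    def __init__(self):
        self.cached_ids = set()   # items the MH is believed to hold
        self.pending = []         # (timestamp, item id) invalidations, oldest first

    def note_fetch(self, item_id):
        self.cached_ids.add(item_id)

    def invalidate(self, item_id, ts):
        # Buffer the invalidation only if the MH actually caches the item.
        if item_id in self.cached_ids:
            self.pending.append((ts, item_id))
            self.cached_ids.discard(item_id)

    def probe(self, cache_ts):
        # Reply with every invalidation newer than the MH's cache timestamp.
        missed = [(ts, i) for ts, i in self.pending if ts > cache_ts]
        self.pending = missed     # older entries were already received by the MH
        return missed


class MobileHost:
    def __init__(self, hlc):
        self.hlc = hlc
        self.cache = {}
        self.cache_ts = 0         # timestamp of the last invalidation report received

    def fetch(self, item_id, value):
        self.cache[item_id] = value
        self.hlc.note_fetch(item_id)

    def reconnect(self):
        # Probe the home MSS and apply whatever was missed while disconnected.
        for ts, item_id in self.hlc.probe(self.cache_ts):
            self.cache.pop(item_id, None)
            self.cache_ts = max(self.cache_ts, ts)


hlc = HomeLocationCache()
mh = MobileHost(hlc)
mh.fetch("a", 1)
mh.fetch("b", 2)
hlc.invalidate("a", ts=7)   # arrives while the MH is disconnected
mh.reconnect()              # the cache now holds only "b"
```

Since the HLC buffers invalidations until they are acknowledged through a later probe, the MH can stay disconnected for an arbitrary duration without having to discard its whole cache.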
Mobile Data Access 18
Hoarding
Planned and accidental disconnections are not considered failures.
A technique to reduce the cost of cache misses during disconnection: load necessary data before disconnect and be ready.
Hoarding techniques:
User-provided information (client-initiated disconnection)
• Explicitly specify which data (files, tables) to hoard
• Implicitly based on the specified application
Access-structure-based (use past history)
E.g., tree-based in file systems, access paths (joins) in databases
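One way to combine the two kinds of information is sketched below, assuming a fixed cache capacity measured in items. The function choose_hoard_set and its frequency-based ranking are illustrative assumptions, not a technique taken from a specific system.

```python
from collections import Counter


def choose_hoard_set(explicit, access_log, capacity):
    """Pick items to hoard: user-specified items first, then most-used items.

    explicit   -- items the user asked for (hypothetical hoard profile)
    access_log -- past accesses, one entry per access
    capacity   -- maximum number of items the cache can hold
    """
    hoard = list(dict.fromkeys(explicit))[:capacity]   # dedupe, keep user order
    for item, _count in Counter(access_log).most_common():
        if len(hoard) >= capacity:
            break
        if item not in hoard:
            hoard.append(item)
    return hoard


log = ["a", "b", "a", "c", "b", "a"]
print(choose_hoard_set(["report.doc"], log, capacity=3))
# → ['report.doc', 'a', 'b']
```

Ranking by raw frequency is the simplest use of past history; a structure-based scheme would instead follow directory trees or join paths from the items already chosen.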
Mobile Data Access 19
Hoarding versus Prefetching
Both pre-fetch data in anticipation of future use.
Prefetching
Objective is to improve performance (throughput or response time).
Cache miss is not catastrophic.
Hoarding
Objective is to fetch all needed data into the MU cache prior to disconnection. Thus the goal is to facilitate disconnected operation.
Cache miss is catastrophic.
OK to overfetch.
Mobile Data Access 20
Hoarding in Database Systems
Granularity of hoarding
RDBMS: ranges from tables, set of tables, whole relations
OO DBMS: objects, set of objects or class
Hoard by issuing queries or materialized views
User may explicitly issue hoarding queries
E.g., Create View with Update-On clause [Lauzac 98]
OO query to describe hoarding profiles [Gruber 94]
History of past references, both queries and data objects
Hoard Keys - an extended database organization [Badrinath 98]
• hoard keys are used to partition a relation into disjoint logical horizontal fragments
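The idea of horizontal fragmentation by a hoard key can be illustrated with a short sketch. Treating the hoard key as a single column and rows as dicts is an assumption made for the example; the helper partition_by_hoard_key and the sample relation are hypothetical.

```python
from collections import defaultdict


def partition_by_hoard_key(rows, hoard_key):
    """Split a relation (a list of dict rows) into disjoint horizontal fragments.

    Every row lands in exactly one fragment, chosen by its hoard-key value.
    """
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[hoard_key]].append(row)
    return dict(fragments)


customers = [
    {"id": 1, "name": "Ann", "region": "north"},
    {"id": 2, "name": "Bob", "region": "south"},
    {"id": 3, "name": "Cy", "region": "north"},
]
fragments = partition_by_hoard_key(customers, "region")
hoarded = fragments["north"]   # a client covering the north region hoards this fragment
```

Because the fragments are disjoint, a client that hoards one fragment knows exactly which rows it holds and which it would have to fetch after reconnecting.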
Mobile Data Access 21
References
D. Barbara and T. Imielinski, Sleepers and Workaholics: Caching Strategies in Mobile Environments, VLDB Journal, 4, 567-602, 1995.
A. Kahol, S. Khurana, S. K. S. Gupta, and P. K. Srimani, A Strategy to Manage Cache Consistency in a Disconnected Distributed Environment, IEEE Transactions on Parallel and Distributed Systems, 12(7), 686-700, July 2001.