web cache replacements

Page 1: Web  Cache Replacements

Web Cache Replacements

Yeim-Kuan Chang (張燕光)
Department of Computer Science and Information Engineering, National Cheng Kung University

ykchang@mail.ncku.edu.tw

Page 2: Web  Cache Replacements

2

Introduction

• Which page should be removed from the cache?
  – Goal: find a replacement algorithm that yields a high hit rate.
• Differences from traditional caching
  – Object sizes are non-homogeneous.
  – For objects with the same access frequency but different sizes, favoring smaller objects is best if only hit rate is considered.
• Byte hit rate

Page 3: Web  Cache Replacements

3

Introduction

• Other considerations
  – Transfer time cost
  – Expiration time
  – Frequency
• Measurement metrics?
• Admission control?
• When or how often to perform the replacement operations?
• How many documents to remove?

Page 4: Web  Cache Replacements

4

Measurement Metrics

• Hit Rate (HR):
  – % of requests satisfied by the cache
  – (shows the fraction of requests not sent to the server)
• Volume measures:
  – Weighted hit rate (WHR):
    • % of client-requested bytes returned by the proxy (shows the fraction of bytes not sent by the server)
  – Fraction of packets not sent
  – Reduction in distance traveled (e.g., hop count)
• Latency time
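As a small illustration of the difference between HR and WHR, the sketch below computes both from a list of requests. The function name `hit_rates` and the trace values are hypothetical and chosen only for illustration.

```python
def hit_rates(requests):
    """requests: iterable of (size_in_bytes, was_hit) pairs.

    Returns (HR, WHR): the fraction of requests served from the cache
    and the fraction of requested bytes served from the cache.
    """
    total = hits = total_bytes = hit_bytes = 0
    for size, was_hit in requests:
        total += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    hr = hits / total if total else 0.0
    whr = hit_bytes / total_bytes if total_bytes else 0.0
    return hr, whr


# Hypothetical trace: three small hits and one large miss.
trace = [(1_000, True), (2_000, True), (1_000, True), (96_000, False)]
hr, whr = hit_rates(trace)
print(f"HR = {hr:.2f}, WHR = {whr:.2f}")
```

With this trace most requests hit but most bytes miss, which is why a policy that favors small objects can look good on HR and poor on WHR.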

Page 5: Web  Cache Replacements

5

Three Categories

• Traditional replacement policies and their direct extensions:
  – LRU, LFU, …
• Key-based replacement policies
• Cost-based replacement policies

Page 6: Web  Cache Replacements

6

Traditional replacement

• Least Recently Used (LRU) evicts the object that was requested least recently.
  – Prune off as many of the least recently used objects as necessary to free sufficient space for the newly accessed object.
  – This may involve zero, one, or many replacements.
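A minimal sketch of size-aware LRU, assuming the cache capacity is measured in bytes. The class name `LRUCache` and its `access` method are hypothetical, not taken from any cited system.

```python
from collections import OrderedDict


class LRUCache:
    """Size-aware LRU: evicts as many old objects as needed to fit a new one."""

    def __init__(self, capacity):
        self.capacity = capacity      # total space in bytes
        self.used = 0
        self.objects = OrderedDict()  # url -> size, least recently used first

    def access(self, url, size):
        """Return (hit, evicted_urls) for one request."""
        if url in self.objects:
            self.objects.move_to_end(url)  # mark as most recently used
            return True, []
        evicted = []
        if size > self.capacity:
            return False, evicted          # object can never fit; do not cache
        while self.used + size > self.capacity:
            old_url, old_size = self.objects.popitem(last=False)
            self.used -= old_size
            evicted.append(old_url)
        self.objects[url] = size
        self.used += size
        return False, evicted
```

Depending on the free space and the size of the incoming object, a single miss evicts zero, one, or several objects.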

Page 7: Web  Cache Replacements

7

Traditional replacement

• Least Frequently Used (LFU) evicts the object that is accessed least frequently.
• Pitkow/Recker [78] evicts objects in LRU order, except when all objects were accessed within the same day, in which case the largest one is removed.

Page 8: Web  Cache Replacements

8

Key-based Replacement

• The idea in key-based policies is to sort objects based upon a primary key, break ties based on a secondary key, break remaining ties based on a tertiary key, and so on.

Page 9: Web  Cache Replacements

9

Key-based Replacement

• LRU-MIN:
  – This policy is biased in favor of smaller objects so as to minimize the number of objects replaced.
  – Let the size of the incoming object be S. Suppose that this object will not fit in the cache.
  – If there are any objects in the cache of size at least S, we remove the least recently used such object from the cache.
  – If there are no objects of size at least S, we start removing, in LRU order, objects of size at least S/2, then objects of size at least S/4, and so on until enough free cache space has been created.
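The LRU-MIN steps above can be sketched as follows. This is one reading of the description, assuming all object sizes are positive; the function name `lru_min_victims` and its argument layout are hypothetical.

```python
def lru_min_victims(cache, incoming_size, free_space):
    """Pick eviction victims following the LRU-MIN description.

    cache: list of (url, size) ordered from least to most recently used.
    Returns the list of urls to evict so that incoming_size fits.
    """
    victims = []
    remaining = list(cache)
    threshold = incoming_size          # first pass: objects of size >= S
    while free_space < incoming_size and remaining:
        # Scan in LRU order for objects at least as large as the threshold.
        for entry in list(remaining):
            url, size = entry
            if size >= threshold:
                victims.append(url)
                remaining.remove(entry)
                free_space += size
                if free_space >= incoming_size:
                    break
        threshold /= 2                 # next pass: S/2, then S/4, ...
    return victims
```

When one cached object is at least as large as the incoming one, a single eviction suffices, which is the bias toward few replacements that the policy aims for.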

Page 10: Web  Cache Replacements

10

Key-based Replacement

• SIZE policy:
  – Objects are removed in order of size, with the largest object removed first.
  – Ties based on size are somewhat rare, but when they occur they are broken by considering the time since last access: objects with a longer time since last access are removed first.
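The SIZE policy is a two-key sort, so it can be sketched with a tuple sort key. The `CachedObject` record and the `removal_order` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    url: str
    size: int           # bytes
    last_access: float  # timestamp in seconds


def removal_order(objects, now):
    """Sort objects into removal order for the SIZE policy.

    Primary key: size, largest first.
    Secondary key: time since last access, longest first.
    """
    return sorted(objects, key=lambda o: (-o.size, -(now - o.last_access)))


objects = [
    CachedObject("a", 500, last_access=10.0),
    CachedObject("b", 900, last_access=50.0),
    CachedObject("c", 900, last_access=20.0),
]
order = [o.url for o in removal_order(objects, now=100.0)]
```

Other key-based policies differ only in the tuple returned by the key function.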

Page 11: Web  Cache Replacements

11

Key-based Replacement

• LRU-Threshold [2] is the same as LRU, but objects larger than a certain threshold size are never cached.
• Hyper-G [78] is a refinement of LFU that breaks ties using the recency of last use and size.
• Lowest Latency First [77] minimizes average latency by evicting the document with the lowest download latency first.

Page 12: Web  Cache Replacements

12

Cost-based Replacement

• These policies employ a potential cost function derived from different factors such as time since last access, entry time of the object in the cache, transfer time cost, object expiration time, and so on.
  – GreedyDual-Size (GD-Size) associates a cost with each object and evicts the object with the lowest cost/size.
  – Hybrid [77] associates a utility function with each object and evicts the one that has the least utility, in order to reduce the total latency.

Page 13: Web  Cache Replacements

13

Cost-based Replacement

• Lowest Relative Value evicts the object with the lowest utility value.
• Least Normalized Cost Replacement (LNC-R) [70] employs a rational function of the access frequency, the transfer time cost, and the size.
• Bolot/Hoschka [10] employs a weighted rational function of the transfer time cost, the size, and the time since last access.

Page 14: Web  Cache Replacements

14

Cost-based Replacement

• Size-Adjusted LRU (SLRU) orders the objects by their ratio of cost to size and chooses objects with the best cost-to-size ratio.
• The server-assisted scheme models the value of caching an object in terms of its fetching cost, size, next request time, and cache prices during the time period between requests. It evicts the object of the least value.
• Hierarchical GreedyDual (Hierarchical GD) does object placement and replacement cooperatively in a hierarchy.

Page 15: Web  Cache Replacements

15

GreedyDual-Size

• GreedyDual was originally proposed by Young and Tarjan. It is concerned with the case where pages in a cache have the same size but incur different costs to fetch from secondary storage.
• A value H is associated with each cached page p when the page is brought into the cache.
  – H is set to the cost of bringing the page into the cache; the cost is always nonnegative.
• (1) The page with the lowest H value (minH) is replaced, and (2) all remaining pages then reduce their H values by minH.

Page 16: Web  Cache Replacements

16

GreedyDual-Size

• If a page is accessed, its H value is restored to the cost of bringing it into the cache.
• Thus the H values of recently accessed pages retain a larger portion of the original cost than those of pages that have not been accessed for a long time.
• By reducing the H values as time goes on and restoring them upon access, the algorithm integrates the locality and cost concerns in a seamless fashion.

Page 17: Web  Cache Replacements

17

GreedyDual-Size

• To handle documents of different sizes, H is set to cost/size upon each access to a document, where cost is the cost of bringing the document into the cache and size is the size of the document in bytes.
  – This extended version is called GreedyDual-Size.
• The definition of cost depends on the goal of the replacement algorithm. cost is set to
  – 1 if the goal is to maximize hit ratio,
  – the downloading latency if the goal is to minimize average latency,
  – the network cost if the goal is to minimize the total cost.

Page 18: Web  Cache Replacements

18

GreedyDual-Size

• Implementation:
  – The basic algorithm needs to decrement the H values of all pages in the cache by minH every time a page is replaced, which may be very inefficient.
  – An improved algorithm is given on the next slide.
  – It maintains a priority queue based on H.
  – Handling a hit requires O(log k) time and handling an eviction requires O(log k) time, where k is the number of cached documents, since in both cases the queue needs to be updated.

Page 19: Web  Cache Replacements

19

GreedyDual-Size

Algorithm GreedyDual (document p)
/* Initialize L ← 0 */
(1) If p is already in memory,
(2)     H(p) ← L + c(p)/size(p)
(3) If p is not in memory,
(4)     while there is not enough room in memory for p,
(5)         Let L ← min H(q) for all q in cache
(6)         Evict q such that H(q) = L
(7)     Put p into memory and set H(p) ← L + c(p)/size(p)
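The pseudocode above can be sketched in Python with a binary heap as the priority queue. This is a sketch under stated assumptions, not the reference implementation: the class name, the `cost` callback, and the lazy deletion of stale heap entries are choices made for this illustration.

```python
import heapq


class GreedyDualSize:
    """GreedyDual-Size with an inflation value L instead of per-page decrements."""

    def __init__(self, capacity, cost=lambda url, size: 1.0):
        self.capacity = capacity  # bytes
        self.used = 0
        self.L = 0.0              # running minimum H of evicted documents
        self.cost = cost          # hypothetical callback; 1.0 targets hit ratio
        self.entries = {}         # url -> (H, size)
        self.heap = []            # (H, url); may hold stale entries

    def _set_h(self, url, size):
        h = self.L + self.cost(url, size) / size
        self.entries[url] = (h, size)
        heapq.heappush(self.heap, (h, url))

    def access(self, url, size):
        """Return True on a hit. Assumes size > 0."""
        if url in self.entries:
            self._set_h(url, size)        # steps (1)-(2): restore H on a hit
            return True
        if size > self.capacity:
            return False                  # cannot fit; not cached
        while self.used + size > self.capacity:
            h, victim = heapq.heappop(self.heap)
            # Skip heap entries made stale by a later hit or eviction.
            if victim in self.entries and self.entries[victim][0] == h:
                self.L = h                # step (5)
                self.used -= self.entries.pop(victim)[1]  # step (6)
        self._set_h(url, size)            # step (7)
        self.used += size
        return False
```

Adding L to new H values has the same effect as subtracting minH from every cached page, so each hit or eviction costs only a heap update.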

Page 20: Web  Cache Replacements

20

Hybrid Algorithm (HYB)

• Motivated by Bolot and Hoschka's algorithm.
• HYB is a hybrid of several factors: it considers not only download time but also the number of references to a document and the document size. HYB selects for replacement the document i with the lowest value of the following expression:

  $\left(clat_{ser(i)} + W_B / cbw_{ser(i)}\right) \cdot nref_i^{\,W_N} / s_i$

  – $nref_i$: number of references to document i since it last entered the cache,
  – $s_i$: the size in bytes of document i, and
  – $W_B$ and $W_N$: constants that set the relative importance of the variables $cbw_{ser(i)}$ and $nref_i$, respectively.

Page 21: Web  Cache Replacements

21

Hybrid

• The utility function is defined as follows:

  $\left(C_s + \dfrac{W_b}{b_s}\right) \dfrac{(n_p)^{W_n}}{Z_p}$

  – $C_s$ is the estimated time to connect to the server,
  – $b_s$ is the estimated bandwidth to the server,
  – $Z_p$ is the size of the document,
  – $n_p$ is the number of times the document has been referenced, and
  – $W_b$ and $W_n$ are constants that set the relative importance of the variables $b_s$ and $n_p$, respectively.
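A short sketch of how the utility function selects a victim. The values of `WB` and `WN` and the per-document numbers below are hypothetical placeholders, not tuned constants from the cited work.

```python
def hyb_value(clat, cbw, nref, size, wb, wn):
    """Utility (clat + wb / cbw) * nref**wn / size; the lowest value is evicted."""
    return (clat + wb / cbw) * (nref ** wn) / size


WB = 10_000.0  # hypothetical weight on bandwidth
WN = 1.0       # hypothetical weight on reference count

# url -> (connection latency in s, bandwidth in bytes/s, references, size in bytes)
docs = {
    "a": (0.5, 10_000.0, 1, 2_000),
    "b": (0.5, 10_000.0, 4, 2_000),
    "c": (0.5, 10_000.0, 1, 50_000),
}
victim = min(docs, key=lambda url: hyb_value(*docs[url], wb=WB, wn=WN))
```

With equal server estimates, the large, rarely referenced document has the lowest utility and is evicted first.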

Page 22: Web  Cache Replacements

22

Latency Estimation Algo. (LAT) [REF]

• Motivation: estimate the time required to download each document, then replace the document with the smallest estimated download time.
• Apply some function to combine (e.g., smooth) the download time samples into an estimate of how long it will take to download the document.
  – Keeping a per-document estimate is probably not practical.
  – Alternative: keep statistics of past downloads on a per-server basis rather than a per-document basis (less storage).
• For each server j, the proxy maintains two estimates:
  – $clat_j$: estimated latency (time) to open a connection to the server,
  – $cbw_j$: estimated bandwidth of the connection (in bytes/second).

Page 23: Web  Cache Replacements

23

Latency Estimation Algo. (LAT) [REF]

  – When a new document is received from server j, the connection establishment latency (sclat) and the bandwidth for that document (scbw) are measured, and the estimates are updated as follows:

    $clat_j = (1 - ALPHA) \cdot clat_j + ALPHA \cdot sclat$

    $cbw_j = (1 - ALPHA) \cdot cbw_j + ALPHA \cdot scbw$

  – ALPHA is a smoothing constant, set to 1/8 as in the TCP smoothed estimation of RTT.
  – Let ser(i) denote the server on which document i resides, and $s_i$ denote the document's size. Cache replacement algorithm LAT selects for replacement the document i with the smallest download time estimate, denoted $d_i$:

    $d_i = clat_{ser(i)} + s_i / cbw_{ser(i)}$
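The smoothing updates and the download time estimate can be sketched as below. The class name `ServerEstimate` and the sample values are hypothetical; only the value ALPHA = 1/8 and the three formulas come from the slides.

```python
ALPHA = 1 / 8  # smoothing constant, as in TCP's smoothed RTT estimate


class ServerEstimate:
    """Per-server estimates of connection latency (s) and bandwidth (bytes/s)."""

    def __init__(self, clat, cbw):
        self.clat = clat
        self.cbw = cbw

    def update_latency(self, sclat):
        # clat_j = (1 - ALPHA) * clat_j + ALPHA * sclat
        self.clat = (1 - ALPHA) * self.clat + ALPHA * sclat

    def update_bandwidth(self, scbw):
        # cbw_j = (1 - ALPHA) * cbw_j + ALPHA * scbw
        self.cbw = (1 - ALPHA) * self.cbw + ALPHA * scbw

    def download_time(self, size):
        # d_i = clat_ser(i) + s_i / cbw_ser(i)
        return self.clat + size / self.cbw


est = ServerEstimate(clat=0.8, cbw=8_000.0)
est.update_latency(1.6)        # one slow connection sample
est.update_bandwidth(16_000.0) # one fast transfer sample
d = est.download_time(9_000)   # estimated seconds to refetch a 9000-byte document
```

Because ALPHA is small, a single unusual sample moves each estimate by only one eighth of the difference.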

Page 24: Web  Cache Replacements

24

Latency Estimation Algo. (LAT)

• One detail remains:
  – A proxy runs at the application layer of the network protocol stack, and therefore cannot obtain the connection latency samples sclat directly.
  – Therefore the following heuristic is used to estimate connection latency. A constant CONN is chosen (e.g., 2 Kbytes). The download time of every document the proxy receives whose size is less than CONN is used as a connection latency sample sclat.
  – Every document whose size exceeds CONN is used as a bandwidth sample as follows:

    scbw = download time of document − current value of $clat_j$.

Page 25: Web  Cache Replacements

25

Lowest Relative Value (LRV)

• Time from the last access, t: chosen for its large influence on the probability of a new access.
  – The probability of a new access conditioned on the time from the last access can be expressed as (1 − D(t)).
• Number of previous accesses, i: this parameter allows the proxy to select a relatively small number of documents with a much higher probability of being accessed again.
• Document size, s: this seems to be the most effective parameter for making a selection among documents with only one access.

Page 26: Web  Cache Replacements

26

Lowest Relative Value (LRV)

• We compute Pr(i, t, s) as follows:

  $Pr(i, t, s) = P_1(s)\,(1 - D(t))$ if $i = 1$

  $Pr(i, t, s) = P_i\,(1 - D(t))$ otherwise

  – $P_i$: conditional probability that a document is referenced i+1 times given that it has been accessed i times
  – $P_1(s)$: percentage of documents of size s with at least 2 accesses
  – $D(t)$: density distribution of times between consecutive requests to the same document, derived as

  $D(t) = 0.035 \log(t+1) + 0.45\left(1 - e^{-t/2\mathrm{E}6}\right)$
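A sketch of the probability computation, under assumptions the slides do not settle: the logarithm is taken as natural, t is taken in seconds, the exponent is read as −t/2E6, and the tables for P1(s) and Pi are hypothetical stand-ins for values measured from a trace.

```python
import math


def d(t):
    """D(t) as read from the slide; natural log and seconds are assumptions."""
    return 0.035 * math.log(t + 1) + 0.45 * (1 - math.exp(-t / 2e6))


def reaccess_probability(i, t, size, p1_by_size, p_by_count):
    """Pr(i, t, s): P1(s) * (1 - D(t)) if i == 1, else Pi * (1 - D(t))."""
    base = p1_by_size(size) if i == 1 else p_by_count[i]
    return base * (1 - d(t))


# Hypothetical tables; real values would be estimated from an access log.
p1 = lambda size: 0.2          # P1(s), here independent of size
p_counts = {2: 0.5, 3: 0.6}    # Pi for i >= 2

fresh = reaccess_probability(3, 0, 1_000, p1, p_counts)
hour_old = reaccess_probability(3, 3_600, 1_000, p1, p_counts)
```

The estimated probability of a new access falls as the time since the last access grows, which is what makes older documents better eviction candidates.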

Page 27: Web  Cache Replacements

27

Performance from Pei Cao

• Metrics: hit ratio, byte hit ratio, reduced latency, and reduced hops
  – reduced latency = the sum of downloading latencies for the pages that hit in the cache, as a percentage of the sum of all downloading latencies
  – reduced hops = the sum of the network costs for the pages that hit in the cache, as a percentage of the sum of the network costs of all Web pages
• The network cost of each document is modeled as hops.
  – Each Web server has a hop value of 1 or 32; 1/8 of the servers are assigned hop value 32 and 7/8 hop value 1.
  – The hop value can be thought of either as the number of network hops traveled by a document or as the monetary cost associated with the document.

Page 28: Web  Cache Replacements

28

Performance from Pei Cao

• GD-Size(1) sets the cost of each document to 1, thus trying to maximize hit ratio.
• GD-Size(packets) sets the cost of each document to 2 + size/536, i.e., the estimated number of network packets sent and received if a miss to the document happens.
  – 1 packet for the request, 1 packet for the reply, and size/536 extra data packets, assuming a 536-byte TCP segment size.
  – It tries to maximize both hit ratio and byte hit ratio.
• Finally, GD-Size(hops) sets the cost of each document to the hop value of the document, trying to minimize network costs.

Page 29: Web  Cache Replacements

29

Performance from Pei Cao

• See Cao’s paper: page 4

Page 30: Web  Cache Replacements

30

Lowest Relative Value (LRV)

Page 31: Web  Cache Replacements

31

Lowest Relative Value (LRV)

Page 32: Web  Cache Replacements

32

Lowest Relative Value (LRV)

percentage of wrong choices in discarding documents vs # of accesses issued to the document at the moment of the choice

the cache size is 500Mb

Page 33: Web  Cache Replacements

33

Lowest Relative Value (LRV)

cumulative number of wrong choices in discarding documents vs # of accesses issued to the document at the moment of the choice

the cache size is 500Mb

Page 34: Web  Cache Replacements

34

Bolot/Hoschka’s algorithm ’96

• Consider the following variables:
  – $t_i$, the time since the document was last referenced
  – $S_i$, the size of the document
  – $rtt_i$, the time it took to retrieve the document
  – $ttl_i$, the time to live of the document (i.e., the expected time until the document will be updated at the remote site, which is also the time interval until the cached document becomes stale)
  – Assign a weight to each cached document i: $W_i = W(t_i, S_i, rtt_i, ttl_i)$.
• With $W(t_i, S_i, rtt_i, ttl_i) = 1/t_i$, documents are replaced according to the time of last reference. This models the LRU algorithm.
• With $W(t_i, S_i, rtt_i, ttl_i) = S_i$, documents are cached on the basis of size only.

Page 35: Web  Cache Replacements

35

Bolot and Hoschka's algorithm

• Proposed weight function:
• $W(t_i, S_i, rtt_i, ttl_i) = (w_1\,rtt_i + w_2\,S_i)/ttl_i + (w_3 + w_4\,S_i)/t_i$
• where $w_1$, $w_2$, $w_3$, and $w_4$ are constants.
• The second term on the right-hand side captures the temporal locality.
• The first term captures the cost associated with retrieving documents (waiting cost, storage cost in the cache), while the multiplying factor $1/ttl_i$ indicates that the cost associated with retrieving a document increases as the useful lifetime of the document decreases.
• $ttl_i$ is the expiration time provided by Web servers.

Page 36: Web  Cache Replacements

36

Bolot and Hoschka's algorithm

• It remains to define the parameters $w_i$. They depend on the goal of the caching algorithm.
  – The goal might be to maximize the hit ratio, to minimize the perceived retrieval time for a random user, to minimize the cache size for a given hit ratio, etc.
  – This can be expressed as a standard optimization problem and solved using variants of the Lagrange multiplier technique.
• The authors use the following algorithms:
  – Algo 1: $W(t_i, S_i, rtt_i, ttl_i) = w_3/t_i$
  – Algo 2: $W(t_i, S_i, rtt_i, ttl_i) = w_1\,rtt_i + w_2\,S_i + (w_3 + w_4\,S_i)/t_i$
  – W is expressed in bytes, and in all cases $w_1$ = 5000 b/s, $w_2$ = 1000, $w_3$ = 10000 bs, and $w_4$ = 10 s.
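A sketch of the two weight functions with the constants listed above. It assumes, as the LRU special case W = 1/t implies, that the document with the lowest weight is replaced first; the function names and sample documents are hypothetical.

```python
W1 = 5000.0    # bytes per second
W2 = 1000.0
W3 = 10000.0   # byte-seconds
W4 = 10.0      # seconds


def weight_algo1(t, size, rtt, ttl):
    """Algo 1: W = w3 / t, which orders documents as LRU does."""
    return W3 / t


def weight_algo2(t, size, rtt, ttl):
    """Algo 2: W = w1*rtt + w2*S + (w3 + w4*S) / t."""
    return W1 * rtt + W2 * size + (W3 + W4 * size) / t


# Hypothetical documents: url -> (t, size, rtt, ttl); ttl is unused by both
# algorithms but kept so the signature matches W(t, S, rtt, ttl).
docs = {
    "recent": (5.0, 100, 2.0, 3600.0),
    "old": (50.0, 100, 2.0, 3600.0),
}
# Assumption: the lowest-weight document is replaced first.
victim = min(docs, key=lambda url: weight_algo1(*docs[url]))
```

Under Algo 1 the victim is simply the document referenced longest ago; Algo 2 additionally raises the weight of documents that were slow to retrieve.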

Page 37: Web  Cache Replacements

37

Key-based Replacement (P.4)

Page 38: Web  Cache Replacements

38

Key-based Replacement

• Removal policies can be classified by a taxonomy defined in terms of a sorting procedure. A policy works in two phases:
  – First, it sorts the documents in the cache according to one or more keys (e.g., primary key, secondary key, etc.).
  – Then it removes zero or more documents from the head of the sorted list until a criterion is satisfied.

Page 39: Web  Cache Replacements

39

Pitkow/Recker’s Memory Model

• Human memory has a long tradition of research in the psychology literature (Ebbinghaus, 1885/1964).
  – One focus of this research is the relationship between the time delay after an item is presented and subsequent performance on recall.
  – A related focus is the relationship between the number of practice trials for items and subsequent performance on recall.
• As might be expected, the results show that shorter delays and more practice lead to better recall performance.

Page 40: Web  Cache Replacements

40

Pitkow/Recker’s Memory Model

• Anderson/Schooler (1991) argue that the relationship between the time when an item is first presented and subsequent performance (retention) is a power function. Therefore, under a logarithmic transform, a linear relationship is found between the time and performance measures.
• They argue that the relationship between the number of practice trials and performance is also a power function.
• In order to determine how past usage of information predicts future usage, they developed an algorithm for computing and estimating the occurrences of human-originated environmental events based upon event frequency and recency rates.

Page 41: Web  Cache Replacements

41

Pitkow/Recker’s Memory Model

• Pitkow used the above algorithm to determine the relationship between the number of document requests during a period (called the window) and the probability of access on a subsequent day (called the pane).

Page 42: Web  Cache Replacements

42

Frequency

• Suppose that during Window 1 (day 1 to 7) documents A and B are each accessed 6 times, and that on Pane 1 (day 8) A is accessed but B is not. Then, for the first window and pane, the probability of access for a frequency of 6 is the sum of accesses in the pane (1+0) divided by the number of documents with that frequency in the window (1+1), or 0.50.
• Suppose that in Window 2 (day 2 to 8) documents C and D are each accessed 6 times, and that on Pane 2 (day 9) neither document is accessed. The new probability of access for a frequency of 6 is the sum of accesses in the two panes (1+0+0+0) divided by the number of such documents in the two windows (1+1+1+1), or 0.25.
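The worked example above can be reproduced with a few lines. The function name `access_probability` and the observation format are hypothetical; each observation records a document's frequency in a window and whether it was accessed in the following pane.

```python
from collections import defaultdict


def access_probability(observations):
    """Map each frequency value to the observed probability of access.

    observations: iterable of (frequency_in_window, accessed_in_pane) pairs,
    one pair per document per window.
    """
    in_window = defaultdict(int)  # documents seen with this frequency
    in_pane = defaultdict(int)    # of those, documents accessed in the pane
    for freq, accessed in observations:
        in_window[freq] += 1
        in_pane[freq] += int(accessed)
    return {freq: in_pane[freq] / in_window[freq] for freq in in_window}


window1 = [(6, True), (6, False)]   # A accessed on day 8, B not
window2 = [(6, False), (6, False)]  # neither C nor D accessed on day 9
after_first = access_probability(window1)
after_second = access_probability(window1 + window2)
```

The same function yields recency probabilities if the first element of each pair is the number of days since the last request instead of the frequency.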

Page 43: Web  Cache Replacements

43

Recency

• In this case, we look at the probability of document access on the eighth day (the pane) based on how many days have elapsed since the document was last requested in the window (still 7 days).
• The recency probabilities are computed in the same fashion as the frequency probabilities, with recency values being used instead of frequency values.

Page 44: Web  Cache Replacements

44

Pitkow/Recker’s workload

• The log file records WWW accesses to Georgia Tech during a three-month period, 1/1 to 3/31, 1994.
  – The server contained more than 2000 multimedia documents.
  – All accesses made by Georgia Tech machines were removed. These accesses may not accurately represent the average user of the data because they often come from users testing new documents or from default document accesses made by client programs.
  – The trimmed log is 35 MB,
    • with a mean entry length of 100 bytes, totaling roughly 305,000 requests.
    • The number of requests ranged from 300 to 12,000 documents per day, with a mean of 3379 accesses per day over the three-month period.
• Some individual documents in the database were accessed over 4000 times per week.

Page 45: Web  Cache Replacements

45

Pitkow/Recker’s Results

Overall, recency proved to be a stronger predictor than frequency.

Page 46: Web  Cache Replacements

46

Pitkow/Recker’s Results

Figure: number of documents that have a recency of n = 1..7 days, i.e., the number of documents kept in the cache with a recency of n = 1..7 days (a combination of Figures 2 and 3).

Page 47: Web  Cache Replacements

47

Pitkow/Recker’s Results

Let K be the number of pages kept in the cache with a recency of n days. The hit rate is the percentage of requests on a target day that hit on these K pages.

Figure 5: The cumulative hit rate of documents in the cache as a function of how long it has been since the document was last accessed in the previous 7 days.

Page 48: Web  Cache Replacements

48

Pitkow/Recker’s Replacement

if (cache is full) or (end of day) then
    day = window size
    while (cache is above comfort level) and (day > 1) do
        remove files with recency == day
        day = day - 1
    end while
    if day == 1 then
        while (cache is above comfort level) do
            remove largest file
        end while
    end if
end if

The final branch applies when all the cached pages were accessed within one day, i.e., all have a recency of one day.
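A Python sketch of the replacement procedure above, assuming recency is stored as a whole number of days and the comfort level is a byte count. The function name, the argument layout, and the default window of 7 days are assumptions for illustration.

```python
def pitkow_recker_evict(files, used, comfort_level, window_size=7):
    """Evict files until the cache is at or below the comfort level.

    files: dict url -> (size_in_bytes, recency_in_days); modified in place.
    Returns (victims, used_after).
    """
    victims = []
    day = window_size
    # Remove the least recently used days first, one whole day at a time.
    while used > comfort_level and day > 1:
        for url, (size, recency) in list(files.items()):
            if recency == day:
                del files[url]
                used -= size
                victims.append(url)
        day -= 1
    # Everything left was accessed within one day: remove largest files first.
    if day == 1:
        by_size = sorted(files.items(), key=lambda kv: -kv[1][0])
        for url, (size, _) in by_size:
            if used <= comfort_level:
                break
            del files[url]
            used -= size
            victims.append(url)
    return victims, used
```

The caller would invoke this when the cache is full or at the end of each day, matching the outer condition of the pseudocode.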

Page 49: Web  Cache Replacements

49

Williams’s Paper

• Undergrad (U):
  – About 30 workstations in an undergraduate CS lab from April to October 1995 (190 days), with 173,384 valid accesses requiring transmission of 2.19GB of static Web documents. The workload is representative of a group of clients working in close confines (within speaking distance).
• Classroom (C):
  – 26 workstations in a classroom, with 30,316 valid accesses requiring transmission of 405.7MB of static documents.
  – Users tend to make requests when asked to do so by an instructor.
  – However, results for workloads BR, BL, and G are upper bounds on what real proxies would experience, because a real proxy would probably not cache requests from clients in .cs.vt.edu to servers in .cs.vt.edu.
  – Workload BR is representative of a cache positioned at the point of connection of the Virginia Tech campus to the Internet. Such a cache is useful because it avoids consuming bandwidth

Page 50: Web  Cache Replacements

50

Williams’s Paper

• Graduate (G):
  – At least 25 users, with 46,834 valid accesses requiring transmission of 610.92MB of static Web pages for most of the spring 1995 semester.
  – Representative of clients in one department dispersed throughout a building in separate or common work areas.
• Remote Client Backbone Accesses (BR):
  – Every URL request appearing on the Ethernet backbone of domain .cs.vt.edu with a client outside that domain naming a Web server inside that domain, for a 38-day period in September and October 1995, representing 180,132 requests requiring transmission of 9.61GB of static Web pages.
  – This workload may be representative of a few servers on one large departmental LAN serving documents to world-wide clients.

Page 51: Web  Cache Replacements

51

Williams’s Paper

• Local Client Backbone Accesses (BL):
  – Every URL request appearing on the Computer Science Department backbone with a client from within the department, naming any server in the world, for a 37-day period in September and October 1995, representing 53,881 accesses requiring transmission of 644.55MB of static Web pages. The requests are for servers both within and outside the .cs.vt.edu domain.

Page 52: Web  Cache Replacements

52

Workload Summary (Paper)

Workload  Days  Accesses  Size (GB)  Top type by %Refs  Top type by %Bytes
U         185   188,674   2.26       graphics           graphics
C          95    13,127   0.15       text               graphics
G          78    45,400   0.56       graphics           graphics
BR         37   227,210   9.38       graphics           audio
BL         37    91,188   0.64       graphics           graphics

Page 53: Web  Cache Replacements

53

Experiment Overview

• Trace-driven simulation
• Compare removal policies, viewed as sorting problems
• Questions to answer:
  – 1. Maximum theoretical HR, WHR
  – 2. Best replacement policy
  – 3. Effectiveness of a second level cache
  – 4. Effectiveness of partitioning the cache by media type (question raised by Kwan, McGrath, Reed, Nov. 95, IEEE Computer)

Page 54: Web  Cache Replacements

54

Simulation Assumptions

• Valid access:
  – a legal request
  – the document "passes" the cache (simulate only requests with HTTP return code 200)
• Definition of hit:
  – In reality, a "hit" is either
    • the proxy has the document, and the document is estimated consistent, or
    • the proxy has the document, the document is estimated inconsistent, and a conditional GET returns no document (304 Not Modified).
  – But the traces of 3 workloads lack last-modified times, so an alternate definition is used:
  – Hit = match in URL and size

Page 55: Web  Cache Replacements

55

Simulation Assumptions

• When a URL in the common log file has size zero:
  – If the URL appeared earlier with a non-zero size, use the last size in the simulation.
  – Otherwise the URL is probably a dynamic document; don’t cache it in the simulation.
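The hit definition and the size-zero rule can be sketched as an infinite-cache simulation. The function name and the trace are hypothetical, and counting an uncacheable size-zero request as a miss is an assumption; the slides only say that such documents are not cached.

```python
def simulate_infinite_cache(log):
    """Return (HR, WHR) for an infinite cache.

    log: iterable of (url, size) for requests with HTTP return code 200.
    A hit is a match in both URL and size.
    """
    cached = {}     # url -> size of the cached copy
    last_size = {}  # url -> last non-zero size seen in the log
    hits = total = hit_bytes = total_bytes = 0
    for url, size in log:
        total += 1
        if size == 0:
            if url not in last_size:
                continue  # probably dynamic: counted as a miss, never cached
            size = last_size[url]  # reuse the last known size
        else:
            last_size[url] = size
        total_bytes += size
        if cached.get(url) == size:
            hits += 1
            hit_bytes += size
        else:
            cached[url] = size  # miss, or the document changed size
    hr = hits / total if total else 0.0
    whr = hit_bytes / total_bytes if total_bytes else 0.0
    return hr, whr


# Hypothetical trace: /b changes size, /dyn never reports a size.
trace = [("/a", 100), ("/b", 300), ("/a", 100), ("/a", 0), ("/b", 500), ("/dyn", 0)]
hr, whr = simulate_infinite_cache(trace)
```

Because nothing is ever evicted, the result is an upper bound on what any removal policy could achieve on the same trace.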

Page 56: Web  Cache Replacements

56

Exp 1: Max Theoretical HR, WHR

• Simulate an infinite cache (plot 7-day moving average)
• Workload U (undergrad):
  – Seasonal variation (e.g., new students in fall access new URLs)
  – Cumulative HR = 44.9%, WHR = 31.4%
• Workload C (classroom):
  – Did not show as high a hit rate as expected
  – Increased HR near exams
• Workload BR (remote clients on backbone):
  – Hit rates over 90% due to proximity of the proxy to the servers

Page 57: Web  Cache Replacements

57

Exp.1: Max Possible Hit Rate

Figure annotations: semester start (most users are new, so the hit rate declines steadily); spring break.

Page 58: Web  Cache Replacements

58

Exp 2: Removal Policy Comparison

• Simulate
  – cache size = 10% or 50% of the maximum needed in the no-replacement case
  – all primary keys
  – certain primary/secondary key combinations
• Graph U (undergrad):
  – SIZE is the superior primary key (with random secondary key)
  – A secondary key shows only marginal improvement when the primary key has many ties
• Other workloads:
  – SIZE is superior in all workloads

Page 59: Web  Cache Replacements

59

Ex 2: Primary Key Comparison

(Cache Size = 10% of max needed)

Page 60: Web  Cache Replacements

60

Ex 2: Primary Key Comparison

(Cache Size = 10% of max needed)

Page 61: Web  Cache Replacements

61

Weighted Hit Rate

• Results on the best primary key are inconclusive.
• Most references are to small files, but most bytes are from large files.
• Why SIZE?
  – Most accesses are for smaller documents.
  – A few large documents take the space of many small documents.
  – Concentration of large inter-reference times

Page 62: Web  Cache Replacements

62

Exp. 2: Weighted Hit Rate

Page 63: Web  Cache Replacements

63

Exp. 3: Two Level Caching

• Simulate
  – primary cache size = 10% or 50% of the maximum needed for no replacement
  – secondary cache size = infinite
  – Use the best primary key for HR: SIZE, from the previous experiment

Page 64: Web  Cache Replacements

64

Exp. 3: Two Level Caching

memory-starved primary cache

Page 65: Web  Cache Replacements

65

Exp. 3: Two Level Caching

The “working set” of documents can fit in the primary cache of only 10% of MaxNeeded for two months, after which the second level cache experiences a rapid growth in both HR and WHR.

Page 66: Web  Cache Replacements

66

Exp. 3: Two Level Caching

The second level cache WHR fluctuates throughout the collection period. Each increase in the second level cache WHR corresponds to a time when the Infinite cache WHR increases but the Primary cache WHR decreases.

Page 67: Web  Cache Replacements

67

Exp. 4: Partitioning Cache by Media

• Idea
  – Do clients that listen to music degrade the performance of clients using text and graphics?
  – Could a partitioned cache, with one portion dedicated to audio and the other to non-audio documents, increase the WHR experienced by either audio or non-audio documents?
• Simulate
  – cache size = 10% of max needed
  – two partitions: audio and non-audio

Page 68: Web  Cache Replacements

68

Exp. 4: Partitioning Cache by Media

• Experiment 4 uses
  – a one-level cache with SIZE as the primary key,
  – random as the secondary key,
  – three partition sizes: 1/4, 1/2, or 3/4 of the cache dedicated to audio;
  – the rest is dedicated to non-audio documents.

Page 69: Web  Cache Replacements

69

Exp. 4: Partitioning Cache by Media

Page 70: Web  Cache Replacements

70

Exp. 4: Partitioning Cache by Media

Page 71: Web  Cache Replacements

71

Problems to solve

• Certain sorting keys have intuitive appeal.
  – The first is document type. A sorting key that puts text documents at the front of the removal queue would ensure low latency for text in Web pages, at the expense of latency for other document types.
  – The second sorting key is refetch latency. To a user of international documents, the most obvious caching criterion is one that caches documents so as to minimize overall latency.
    • A European user of North American documents would preferentially cache those documents over ones from other European servers, to avoid using heavily utilized transatlantic network links. Therefore a means of estimating the latency for refetching documents in a cache could be used as a primary sorting key.

Page 72: Web  Cache Replacements

72

Problems to solve

• Caching dynamic documents: a cache is useless for dynamic documents only if the document content changes completely; otherwise a portion, but not all, of the cached copy remains valid.
  – Allow caches to request the differences between the cached version and the latest version of a document.

Page 73: Web  Cache Replacements

73

Problems to solve

• For example, in response to a conditional GET, a server could send the "diff" between the current version and the version matching the Last-Modified date sent by the client; or a specific tag could allow a server to "fill in" a previously cached static "query response form."
  – Another approach to changing semi-static pages (i.e., pages that are HTML but replaced often) is to allow Web servers to preemptively update inconsistent document copies, at least for the most popular ones.

Page 74: Web  Cache Replacements

74

Problems to solve

• We observed a 15% to 55% WHR in a second level cache with a primary cache that is 10% of the size needed for no replacement.
• How would the hit rate change if a single second level cache handled misses from a set of primary caches?
• Whereas we observed concentration in each of the five workloads we studied, how much commonality exists between the workloads if they share a single second level cache?

Page 75: Web  Cache Replacements

75

Problems to solve

• An interesting future study would be the simulation of a multi-level cache more complex than the single first and second level configuration used here.
• A final open problem is to study the interaction of removal algorithms with algorithms that identify when cached copies may be inconsistent, such as those based on expiration times or the time of last modification of documents. For example, the Harvest cache tries to remove expired documents first.

Page 76: Web  Cache Replacements

76

Admission control

• Should a response be stored in the cache or not?
• Example: do not save a document the first time it is requested.

Page 77: Web  Cache Replacements

77

Removal frequency

• On-demand: run the policy when the size of the requested document exceeds the free room in the cache. (The removal takes time while the request waits.)
• Periodically: run the policy every T time units, for some T.
  – Useful if removal is time consuming
• Both on-demand and periodically: run the policy at the end of each day and on-demand (Pitkow/Recker [13]).

Page 78: Web  Cache Replacements

78

On-demand

• Two arguments suggest that the overhead of simply using on-demand replacement will not be significant.
  – First, this class of removal policies maintains a sorted list. If the list is kept sorted as the proxy operates, the removal policy merely removes the head of the list, which should be a fast, constant-time operation.
  – Second, a proxy server keeps read-only documents. Thus there is no overhead for "writing back" a document, as there is in a virtual memory system upon removal of a page that was modified since being loaded.

Page 79: Web  Cache Replacements

79

How many to remove

• Stop the removal process when the free cache area equals or exceeds the requested document size.
• Or replace documents until a certain threshold (Pitkow and Recker's comfort level) is reached.