By Ravi Shankar Dubasi Sivani Kavuri A Popularity-Based Prediction Model for Web Prefetching

Post on 25-Dec-2015


What is Web Latency?

What is Web Caching?

How does Web Caching help in reducing Web Latency?

What is Web Prefetching?

How does Web Prefetching help in reducing Web Latency?

Does Web Prefetching really decrease Web Latency?


Combining Caching and Prefetching.

Performance Improvement.

Why Prediction Models?

What are Prediction Models?

How aggressive is Prefetching?

How aggressive can Prefetching be?


PPM (Prediction by Partial Match) Model

Slight variations of this model.

Model proposed by Xin Chen and Xiaodong Zhang.

POPULARITY BASED PREDICTION MODEL


Log files


Access Session: ENTER, URL, EXIT (labels from the slide's figure)

3 Major Regularities:

Regularity 1: The majority of clients start their access sessions from popular URLs of a server. However, the majority of URLs on a server are not popular files.

Regularity 2: The majority of long access sessions are headed by popular URLs.

Regularity 3: The access paths in the majority of access sessions start from popular URLs, move to less popular URLs, and exit from the least popular URLs. The access paths in a minority of access sessions start from less popular URLs, remain among URLs of the same type, and exit from the least popular URLs.

Popularity of URLs

How to determine popularity of URLs?
How do we grade the URLs?
How to determine Relative Popularity (RP)?

Grade 3: 10% < RP ≤ 100%
Grade 2: 1% < RP ≤ 10%
Grade 1: 0.1% < RP ≤ 1%
Grade 0: RP ≤ 0.1%
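The grading rule above can be sketched in a few lines of Python. The slide does not spell out how RP is computed, so this sketch assumes RP is a URL's access count as a percentage of the most popular URL's access count; the function names and the sample counts are illustrative only.

```python
def relative_popularity(access_counts):
    """RP in percent, ASSUMING RP = accesses to a URL / accesses to the most popular URL."""
    top = max(access_counts.values())
    return {url: 100.0 * n / top for url, n in access_counts.items()}


def popularity_grade(rp_percent):
    """Map an RP percentage to the grades 0..3 listed on the slide."""
    if rp_percent > 10:
        return 3
    if rp_percent > 1:
        return 2
    if rp_percent > 0.1:
        return 1
    return 0


# Hypothetical access counts, not taken from any trace.
counts = {"/index.html": 2000, "/news.html": 100, "/about.html": 10, "/old.html": 1}
grades = {url: popularity_grade(rp) for url, rp in relative_popularity(counts).items()}
```

With these made-up counts the RP values are 100%, 5%, 0.5% and 0.05%, so each URL falls into a different grade.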

Distribution of Popularity Grades

To examine the relationship between URL popularity and access sessions
Divided each trace into 4 session groups
Regularity 1 is observed

Observations

Paying special attention to popular URLs, which are only a small %

Is this advantageous?

Paying little attention to less popular URLs, which can be a large %

What about this?

Popularity and session length

Day 79 traces
86% of access sessions started from popular URLs, moved to less popular URLs and exited from the least popular URLs
Regularity 2 is observed
The average popularity grade decreases as the session length increases.

Observations

Clients starting with less popular URLs tend to surf among URLs with the same popularity.

3 Prediction Models

1. Standard model

2. LRS model (longest repeating sequence)

3. Popularity-based model

(All models are evaluated here according to the 92 day evaluation period)

(All models use the Markov Tree representation)

Standard Model

Node 0 represents the root of the forest
When a client accesses URL A, the model builds a new tree with root A
The counter is set to 1
The counter is incremented every time that URL is accessed in the session
The process continues until all the sessions have been processed
Every path from the root node to a leaf node represents the URL session of at least one client

STANDARD PPM

The three access sequences are:
{ABCA’B’C’}
{ABC}
{A’B’C’}

Tree built from these sequences (node/count; node 0 is the root, and each line is one branch from the root):
0 -> A/2 -> B/2 -> C/2 -> A’/1 -> B’/1 -> C’/1
0 -> B/2 -> C/2 -> A’/1 -> B’/1 -> C’/1
0 -> C/2 -> A’/1 -> B’/1 -> C’/1
0 -> A’/2 -> B’/2 -> C’/2
0 -> B’/2 -> C’/2
0 -> C’/2
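As a rough illustration of how a tree like this can be built, the following Python sketch inserts every suffix of every session into a nested dictionary and counts visits. The dictionary layout and the function names are assumptions made for this example, not part of the slides.

```python
def build_standard_ppm(sessions):
    """Build an unbounded-height tree: every suffix of every session is a path from the root."""
    root = {}  # maps URL -> {"count": int, "children": {...}}
    for session in sessions:
        for start in range(len(session)):
            level = root
            for url in session[start:]:
                node = level.setdefault(url, {"count": 0, "children": {}})
                node["count"] += 1
                level = node["children"]
    return root


def count_nodes(level):
    """Number of nodes below the root (the root itself, node 0, is not counted)."""
    return sum(1 + count_nodes(node["children"]) for node in level.values())


# The three access sequences from the slide; a plain ' stands in for the prime mark.
sessions = [["A", "B", "C", "A'", "B'", "C'"], ["A", "B", "C"], ["A'", "B'", "C'"]]
tree = build_standard_ppm(sessions)
```

For these three sequences the sketch yields 21 nodes below the root, for example A/2 at the first level and A'/1 below A -> B -> C.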

Advantages and Disadvantages:
Easy to build (not complex)
Prediction accuracy improves
More space required (increases with the prediction order, which is determined by entropy analysis and empirical studies)

Attempts at Space Optimization:
Tree no longer resembles the regular surfing patterns
Prediction accuracy is low (short tree)
A small increase in height rapidly increases storage requirements.

LRS Model

The LRS model keeps the longest repeating subsequences and stores only long branches with frequently accessed URLs
The server builds the tree the same way as in standard PPM
Scans each branch for non-repeating sequences
Identifies and eliminates the non-repeating sequences
The stored longest sequence is the frequently repeating sequence (at least one occurrence of a subsequence belongs to an independent access session)
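The idea of keeping only repeating sequences can be illustrated with a simplified Python sketch. It assumes that "repeating" means a contiguous subsequence occurring at least twice across all sessions, and it approximates "longest" by keeping only repeating subsequences that are not contained in a longer repeating one. This is a simplification for illustration, not the exact LRS definition; the names and the `min_count` parameter are hypothetical.

```python
from collections import Counter


def contiguous_counts(sessions):
    """Count every contiguous subsequence of every session."""
    counts = Counter()
    for session in sessions:
        for i in range(len(session)):
            for j in range(i + 1, len(session) + 1):
                counts[tuple(session[i:j])] += 1
    return counts


def contains(longer, shorter):
    """True if `shorter` appears as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[k:k + n] == shorter for k in range(len(longer) - n + 1))


def longest_repeating(sessions, min_count=2):
    """Repeating subsequences that are not part of a longer repeating subsequence."""
    repeating = [s for s, c in contiguous_counts(sessions).items() if c >= min_count]
    return {s for s in repeating
            if not any(len(t) > len(s) and contains(t, s) for t in repeating)}


sessions = [["A", "B", "C", "A'", "B'", "C'"], ["A", "B", "C"], ["A'", "B'", "C'"]]
kept = longest_repeating(sessions)
```

For the three example sequences, this keeps ABC and A'B'C', while the transition from C to A', which occurs only once, is dropped.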

The three access sequences are:
{ABCA’B’C’}
{ABC}
{A’B’C’}

Nodes shown in the slide's tree figure (node/count; node 0 is the root, and each line is one branch from the root):
0 -> A/2 -> B/2 -> C/2 -> A’/1 -> B’/1 -> C’/1
0 -> B/2 -> C/2 -> A’/1 -> B’/1 -> C’/1
0 -> C/2 -> A’/1 -> B’/1 -> C’/1
0 -> A’/2 -> B’/2 -> C’/2
0 -> B’/2 -> C’/2
0 -> C’/2

Advantages and Disadvantages:

The LRS PPM model offers lower storage requirements and higher prediction accuracy

It has low hit rates (because the tree keeps only a small number of frequently accessed (popular) branches, it ignores prefetching for less frequently accessed (unpopular) URLs, so the overall prefetching rate can be low)

The process is expensive (to find the longest match, the server must have all previous URLs of the current session, so the server must maintain sessions and update them)

Popularity Based Prediction Model

It uses only the most popular URLs as root nodes
Each URL in a sequence is added only once to the tree, unless its popularity grade is higher than that of the root node

Maximum tree height is based on:
Available memory space
Access session lengths

Space optimization is done on the completed tree based on:
Relative access probability (RAP)
Absolute number of accesses

(RAP = Number of accesses to the URL / Number of accesses to the parent URL)
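The two pruning criteria can be sketched as a recursive pass over a tree. The `Node` class, the `min_rap` and `min_count` parameters, and the threshold values below are all assumptions chosen for illustration; the slides do not give concrete thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int                                    # number of accesses recorded at this node
    children: dict = field(default_factory=dict)  # URL -> Node


def prune(node, min_rap, min_count=1):
    """Drop children whose RAP (child count / parent count) or absolute count is too low."""
    for url in list(node.children):
        child = node.children[url]
        rap = child.count / node.count
        if rap < min_rap or child.count < min_count:
            del node.children[url]   # removes the whole subtree under this child
        else:
            prune(child, min_rap, min_count)


# Hypothetical subtree rooted at a popular URL "A" with 10 accesses.
a = Node(10, {"B": Node(6, {"D": Node(1)}), "C": Node(1)})
prune(a, min_rap=0.2)
```

Here C is removed (RAP 1/10) and so is D under B (RAP 1/6), while B stays (RAP 6/10).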

The three access sequences are:
{ABCA’B’C’}
{ABC}
{A’B’C’}

[Figure: popularity-based PPM tree for these sequences. Node labels (node/count): 0 (root), A/2, B/2, C/2, A’/1, A’/1, A’/2, B’/2, C’/2]

Advantages and Disadvantages:

Space optimization (since there are fewer nodes)

High prediction accuracy (since it includes access information)

For higher thresholds, the hit ratio decreases (since the dominance of unpopular files increases)

OBSERVATIONS

The Standard PPM model without limiting branch height.
The LRS PPM model keeping the longest repeating subsequence.
The Popularity-based PPM model with space optimization.

1) In the Standard PPM model, without limiting the height of each branch, prediction accuracy is increased.

2) In the LRS PPM model, keeping the longest repeating sequences, i.e., removing independent access sessions, saves space.

3) In the Popularity-based PPM model, space optimization that considers relative access probability preserves prediction accuracy.

Integrating prediction model with prefetching and caching

Cache memory is divided into 2 parts:
Prefetch buffer
Cache memory

Prefetching manager
Cache manager

PREDICTION ENGINE
Constructs and updates the prediction model (based on requests issued)
Offers predictions independently to each client.


Integrated Web Caching and Prefetching Model

PREDICTION ALGORITHM

// T: prediction tree; S: sequence of events in a session; m: maximum tree height
// current_context[j]: node reached by the last j events (NULL if there is none yet)
current_context[0] := root node of T;
for length j = 1 to m
    current_context[j] := NULL;

for every event R in S {
    // j runs downward so that current_context[j] is read before it is overwritten
    for length j = m-1 down to 0 {
        if current_context[j] is not NULL {
            if current_context[j] has child node C representing event R {
                node C occurrence_count := occurrence_count + 1;
            }
            else {
                construct child node C representing event R;
                node C occurrence_count := 1;
            }
            current_context[j+1] := node C;
        }
    }
    current_context[0] := root node of T;
}
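The context-array update can be sketched in Python as follows. The class and function names are hypothetical, and the sketch assumes that contexts are updated from the longest to the shortest, so that each context is read before it is overwritten for the current event.

```python
class Node:
    def __init__(self):
        self.count = 0
        self.children = {}  # event (URL) -> Node


def update_model(root, session, max_order):
    """Feed one session into the tree, keeping branches at most `max_order` nodes deep."""
    # contexts[j] is the node reached by the last j events; contexts[0] is always the root.
    contexts = [root] + [None] * max_order
    for event in session:
        # Go from the longest context down to the root so that contexts[j]
        # still refers to the state before this event when it is read.
        for j in range(max_order - 1, -1, -1):
            parent = contexts[j]
            if parent is None:
                continue
            child = parent.children.get(event)
            if child is None:
                child = parent.children[event] = Node()
            child.count += 1
            contexts[j + 1] = child


def count_nodes(node):
    return sum(1 + count_nodes(child) for child in node.children.values())


sessions = [["A", "B", "C", "A'", "B'", "C'"], ["A", "B", "C"], ["A'", "B'", "C'"]]
full = Node()
for s in sessions:
    update_model(full, s, max_order=6)

shallow = Node()
for s in sessions:
    update_model(shallow, s, max_order=2)
```

With `max_order=6` the three example sessions produce 21 nodes below the root; with `max_order=2` only the first two levels are kept.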

PREFETCHING ALGORITHM

// O(j): the j-th candidate object, 0 ≤ j ≤ n
LET S be the set of all objects currently in the prefetch buffer;
LET P = Ø;          // P is the set of objects to be prefetched
LET TotalSize = 0;  // the total size of all objects in P
LET j = 0;
WHILE (j ≤ n) AND (TotalSize < SIZEOF(prefetch buffer))
    IF (O(j) not in cache) AND (O(j) not in prefetch buffer) THEN
        Put O(j) into P;
        LET TotalSize = TotalSize + SIZEOF(O(j));
    END IF
    j = j + 1;  // advance even when O(j) is skipped, so the loop terminates
END WHILE
LET M = S ∪ P;
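A minimal Python sketch of this selection loop is shown below. It assumes the candidates arrive as an ordered list of predicted objects and that object sizes are known in advance; the function name, parameter names and sample sizes are made up for the example.

```python
def select_prefetch(candidates, sizes, cache, prefetch_buffer, buffer_capacity):
    """Pick candidates that are not already held locally until the size budget is reached."""
    to_prefetch = []
    total_size = 0
    for obj in candidates:
        if total_size >= buffer_capacity:
            break  # same stop condition as the WHILE loop: TotalSize < SIZEOF(buffer)
        if obj not in cache and obj not in prefetch_buffer:
            to_prefetch.append(obj)
            total_size += sizes[obj]
    return to_prefetch


# Hypothetical objects and sizes (arbitrary units).
chosen = select_prefetch(
    candidates=["a", "b", "c", "d"],
    sizes={"a": 40, "b": 30, "c": 50, "d": 10},
    cache={"b"},
    prefetch_buffer=set(),
    buffer_capacity=80,
)
```

Because the size check happens before each object is added, the last selected object may push the total past the capacity, as in the pseudocode; a stricter variant would check `total_size + sizes[obj]` instead.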


Simulation Parameters

1.Order of Prediction

2.Confidence

3.Previous requests

4.Number of predictions

5.Browsing session idle time

6.Client cache size

7.Client cache idle time

Performance Metrics

1. Usefulness of predictions (hit ratio)
2. Accuracy of predictions
3. Network traffic
4. Space optimization

(The model aims at maximizing the first two metrics and minimizing the last two.)

The maximum size of prefetched files affects both hit ratio and network traffic.

Large values »» more traffic »» high hit ratio

Hit Ratio: Ratio between the no. of requests that hit the browser or cache and the total no. of requests.

Latency Reduction: Average access latency time reduction per request.

Space: Required memory allocation, measured by the no. of nodes for building a PPM model in the web server for prefetching.

Traffic Increment: Ratio between the total no. of transferred bytes and the total no. of useful bytes for the clients, minus 1.
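Two of these definitions translate directly into arithmetic. The sketch below uses made-up numbers purely to show the calculation; the function names are illustrative.

```python
def hit_ratio(hits, total_requests):
    """Requests served from the browser or cache, divided by all requests."""
    return hits / total_requests


def traffic_increment(transferred_bytes, useful_bytes):
    """Total transferred bytes divided by useful bytes, minus 1."""
    return transferred_bytes / useful_bytes - 1


# Hypothetical counts: 30 of 100 requests hit, and 1500 bytes were moved
# to deliver 1000 bytes that clients actually used.
example_hit_ratio = hit_ratio(30, 100)
example_increment = traffic_increment(1500, 1000)
```

In this example the hit ratio is 0.3 and the traffic increment is 0.5, meaning prefetching moved 50% more bytes than the clients needed.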


Hit ratio vs threshold


Traffic Increment Vs Threshold


Number of Nodes Vs Number of Clients


CONCLUSIONS

Effective web management approach.

Makes searching and prefetching highly objective and highly efficient.

Web prefetching can have both high prediction accuracy and a low space requirement.


FUTURE WORK

To make the model more flexible.

To find more elaborate ways of making predictions.

Filtering out the effect of backward references.

Extending prediction engine to accommodate more predictions.
