On the Scale and Performance of Cooperative Web Proxy Caching


Post on 01-Jan-2016

  • On the Scale and Performance of Cooperative Web Proxy Caching

    University of Washington

    Alec Wolman, Geoff Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, Henry Levy

  • Caching for a Better Web

    • Performance is a major concern in the Web
    • Proxy caching is the most commonly used method to improve Web performance
    • Duplicate requests to the same document are served from cache
    • Hits reduce latency, network utilization, and server load
    • Misses increase latency (extra hops)

  • Cache Effectiveness

    • Previous work has shown that hit rate increases with population size [Duska et al. 97, Breslau et al. 98]
    • A single proxy cache has practical limits
        • Load, network topology, organizational constraints
    • One technique to scale the client population is to have proxy caches cooperate

  • Cooperative Web Proxy Caching

    • Sharing and/or coordination of cache state among multiple Web proxy cache nodes
    • Effectiveness of proxy cooperation depends on:
        • Inter-proxy communication distance
        • Size of client population served
        • Proxy utilization and load balance
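
The idea of sharing cache state among proxy nodes can be illustrated with a minimal sketch. It is not any particular cooperation protocol from the literature: the `Proxy` class, its `peek` and `get` methods, and the `origin` fetch function are all hypothetical names, and storage is assumed unbounded with no expiration.

```python
from typing import Callable, Dict, List, Optional


class Proxy:
    """Toy proxy cache that asks sibling proxies before going to the origin."""

    def __init__(self, name: str, fetch_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self.fetch_origin = fetch_origin    # hypothetical origin-server fetch
        self.store: Dict[str, bytes] = {}   # assumption: unbounded, never expires
        self.siblings: List["Proxy"] = []

    def peek(self, url: str) -> Optional[bytes]:
        """Answer a sibling's query from local state only (no forwarding)."""
        return self.store.get(url)

    def get(self, url: str) -> str:
        """Serve a client request and report where the object came from."""
        if url in self.store:
            return "local hit"
        for sibling in self.siblings:
            body = sibling.peek(url)
            if body is not None:
                self.store[url] = body      # keep a local copy for next time
                return "remote hit"
        self.store[url] = self.fetch_origin(url)
        return "miss"


def origin(url: str) -> bytes:
    """Stand-in for a real Web server."""
    return ("body of " + url).encode()


a, b = Proxy("a", origin), Proxy("b", origin)
a.siblings, b.siblings = [b], [a]
url = "http://example.com/"
results = [a.get(url), b.get(url), b.get(url)]
print(results)
```

The second request is the case cooperation adds: proxy `b` has never seen the document, but its sibling has, so the origin server is not contacted.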

  • Cooperative Web Caching

    How much benefit does cooperative caching provide in the Web environment?

  • Outline

    • Introduction & related work
    • Trace methodology
    • Cooperative caching for small & medium scale populations
    • Scaling to larger client populations
    • Latency
    • Conclusions

  • Previous Research

    • Cooperative proxy caching is a popular research topic: [e.g. Chankhunthod et al. 96, Zhang et al. 97, Fan et al. 98, Krishnan et al. 98, Menaud et al. 98, Tewari et al. 98, Touch 98, Karger et al. 99 ...]
    • Focus is on highly scalable algorithms
    • Some seek to scale to the entire Web

  • Challenges

    • No real understanding of document sharing across diverse organizations
    • Little analytic or empirical evaluation of these algorithms using realistic workloads for large-scale client populations

    Problem: Evaluating cooperative proxy caching requires multiple simultaneous traces of Web proxies, across a diverse set of organizations

  • Our Contribution

    • We use multi-organization traces to evaluate cooperative proxy caching at small and medium scales
    • We use analytic modelling to evaluate cooperative caching at scales beyond those available in our traces

  • A Multi-Organization Trace

    • University of Washington (UW) is a large and diverse client population
        • Approximately 50K people
    • UW client population contains 200 independent campus organizations
        • Museums of Art and Natural History
        • Schools of Medicine, Dentistry, Nursing
        • Departments of Computer Science, History, and Music

    A trace of UW is effectively a simultaneous trace of 200 diverse client organizations

  • Cooperation Across Organizations

    By considering each UW organization as an independent company with its own clients and its own proxy, we can empirically evaluate cooperative caching across diverse client populations

    How much Web document reuse is there between these organizations? Place a proxy cache in front of each organization. What is the benefit of cooperative caching among these 200 proxies?

  • UW Trace Characteristics

    • Trace collected at UW network border (May 1999)
    • Filtered: requests from UW clients, responses from external Web servers
    • Most requests come directly from clients (0.5% come from proxies)

    Trace               UW
    Duration            7 days
    HTTP objects        18.4 million
    HTTP requests       82.8 million
    Avg. requests/sec   137
    Total Bytes         677 GB
    Servers             244,211
    Clients             22,984
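
As a quick sanity check, the average request rate in the table can be recomputed from the request count and the trace duration. The sketch below also derives a mean response size; treating 1 GB as 10**9 bytes is an assumption, since the slides do not say which convention was used.

```python
requests = 82.8e6             # HTTP requests, from the table
objects = 18.4e6              # HTTP objects, from the table
duration_s = 7 * 24 * 3600    # 7-day trace, in seconds
total_bytes = 677e9           # 677 GB, assuming 1 GB = 10**9 bytes

req_per_sec = requests / duration_s
requests_per_object = requests / objects
mean_bytes_per_request = total_bytes / requests

print(f"{req_per_sec:.1f} requests/sec")
print(f"{requests_per_object:.2f} requests per object")
print(f"{mean_bytes_per_request:.0f} bytes per request")
```

The computed rate rounds to the 137 requests/sec shown in the table.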

  • Question

    What is the benefit of cooperative caching among the 200 UW organizational proxies?

  • Ideal Hit Rates for UW proxies

    Ideal hit rate: infinite storage, ignoring cacheability and expirations

    Average ideal local hit rate: 43%

    Explore benefits of perfect cooperation rather than a particular algorithm

    Average ideal hit rate increases from 43% to 69% with cooperative caching

  • Cacheable Hit Rates for UW proxies

    Cacheable hit rate: same as ideal, but does not ignore cacheability

    Cacheable hit rates are much lower than ideal (average is 20%)

    Average cacheable hit rate increases from 20% to 41% with (perfect) cooperative caching
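
The ideal and cacheable hit-rate definitions, with and without perfect cooperation, can be sketched as a single pass over a trace. This is an illustration under stated assumptions, not the authors' simulator: the record layout `(organization, url, cacheable)` and the function `hit_rates` are hypothetical, and perfect cooperation is modeled as one shared request history.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# One trace record: (organization, url, cacheable). Hypothetical layout.
Record = Tuple[str, str, bool]


def hit_rates(trace: Iterable[Record], cooperative: bool,
              cacheable_only: bool) -> Dict[str, float]:
    """Per-organization hit rate with infinite storage and no expirations.

    cooperative=False: each organization's proxy sees only its own history.
    cooperative=True:  perfect cooperation, modeled as one shared history.
    cacheable_only=True: requests marked uncacheable always count as misses.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    hits: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for org, url, cacheable in trace:
        key = "shared" if cooperative else org
        total[org] += 1
        if cacheable_only and not cacheable:
            continue                      # a miss, and never stored
        if url in seen[key]:
            hits[org] += 1
        else:
            seen[key].add(url)            # first reference: compulsory miss
    return {org: hits[org] / total[org] for org in total}


# Toy trace, invented for illustration only.
trace = [
    ("cs", "/a", True), ("music", "/a", True), ("cs", "/a", True),
    ("music", "/b", False), ("music", "/b", False), ("cs", "/b", False),
]
ideal_local = hit_rates(trace, cooperative=False, cacheable_only=False)
ideal_coop = hit_rates(trace, cooperative=True, cacheable_only=False)
cacheable_local = hit_rates(trace, cooperative=False, cacheable_only=True)
cacheable_coop = hit_rates(trace, cooperative=True, cacheable_only=True)
print(ideal_local, ideal_coop, cacheable_local, cacheable_coop)
```

On the toy trace, cooperation raises every organization's hit rate, and restricting to cacheable requests lowers it, which is the same qualitative pattern the slides report for UW.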

  • Scaling Cooperative Caching

    • Organizations of this size can benefit significantly from cooperative caching
    • We don't need cooperative caching to handle the entire UW population size
    • A single proxy (or small cluster) can handle this entire population!
    • No technical reason to use cooperative caching for this environment
    • In the real world, decisions of proxy placement are often political or geographical

    How effective is cooperative caching at scales where a single cache will not work?

  • Hit Rate vs. Client Population

    • Curves similar to other studies [e.g., Duska97, Breslau98]
    • Small organizations
        • Significant increase in hit rate as client population increases
        • The reason why cooperative caching is effective for UW
    • Large organizations
        • Marginal increase in hit rate as client population increases

  • Extrapolation to Larger Client Populations

    Use least squares fit to create a linear extrapolation of hit rates

    Hit rate increases logarithmically with client population, e.g., to increase hit rate by 10%:

    • Need 8 UWs (ideal)
    • Need 11 UWs (cacheable)

    Low ceiling, though:

    • 100% at 11.3M clients (UW ideal)
    • 61% at 2.1M clients (UW cacheable)

    A city-wide cooperative cache would get all the benefit
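
One way to read the extrapolation above is as a straight-line least-squares fit of hit rate against the logarithm of client population. The sketch below takes that reading as an assumption. The data points are hypothetical, not values from the traces, and "10%" is interpreted as 10 percentage points.

```python
import math
from typing import List, Tuple


def fit_log_linear(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of hit_rate = a + b * log10(clients)."""
    xs = [math.log10(n) for n, _ in points]
    ys = [h for _, h in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def growth_factor_for(delta: float, b: float) -> float:
    """Population multiplier needed to raise the fitted hit rate by delta."""
    return 10 ** (delta / b)


# Hypothetical (clients, hit rate in %) points, NOT values from the trace.
points = [(1_000, 30.0), (10_000, 40.0), (100_000, 50.0)]
a, b = fit_log_linear(points)
print(f"slope: {b:.2f} points per tenfold increase in clients")

# Slope implied by the slide's "8 UWs for +10%" (ideal) figure.
b_ideal = 10 / math.log10(8)
print(f"implied ideal slope: {b_ideal:.2f} points per tenfold increase")
print(f"multiplier for +10 points: {growth_factor_for(10.0, b_ideal):.1f}")
```

Under a logarithmic fit, each additional increment in hit rate costs a constant multiplicative increase in population, which is why the gains flatten out at large scales.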

  • Question

    What is the benefit of cooperative caching among large organizations?

  • UW & Microsoft Cooperation

    • What if we ran a wire across Lake Washington, to connect UW & Microsoft?
    • We collected a Microsoft proxy trace during the same time period as the UW trace
    • Combined population is ~80K clients
        • Increases the UW population by a factor of 3.6
        • Increases the MS population by a factor of 1.4

  • UW & Microsoft Traces

    Trace               UW             MS
    Duration            7 days         6.25 days
    HTTP objects        18.4 million   15.3 million
    HTTP requests       82.8 million   107.7 million
    Avg. requests/sec   137            199
    Total Bytes         677 GB         N/A
    Servers             244,211        360,586
    Clients             22,984         60,233
    Population          ~50,000        ~40,000
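
The scaling factors quoted on the "UW & Microsoft Cooperation" slide appear to follow from the Clients row of this table rather than the Population row. That reading is an inference from the numbers, not something the slides state. A short check:

```python
uw_clients = 22_984    # Clients row, UW column
ms_clients = 60_233    # Clients row, MS column

combined = uw_clients + ms_clients
uw_factor = combined / uw_clients   # growth seen from UW's side
ms_factor = combined / ms_clients   # growth seen from Microsoft's side

print(f"combined clients: {combined:,}")
print(f"UW grows by a factor of {uw_factor:.1f}")
print(f"MS grows by a factor of {ms_factor:.1f}")
```

The combined client count is about 83K, in line with the "~80K clients" on the slide.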

  • UW & MS Cooperative Caching

    Is this worth it?

  • What about Latency?

    • From the client's perspective, latency matters far more than hit rate
    • How does latency change with population?

    Median latencies improve by only a few hundred ms with ideal caching compared to no caching.

    On average, a web page consists of 4.5 HTTP objects

  • Conclusions

    • A negative result: without significant workload changes, designing highly scalable cooperative proxy-cache schemes is unnecessary
    • Largest benefit is achieved with small populations (up to 2K-5K clients)
    • Limited benefit of cooperation when we combined the UW & Microsoft populations
    • Document cacheability is a severe limitation with current workloads
    • Analytic model results:
        • Confirm that most of the benefit is obtained once you reach populations the size of a large city
        • Future workloads: large-scale cooperative caching could become more relevant with different rate-of-change characteristics

  • UW Cooperative Caching Results

  • Extrapolating UW & MS Hit Rates

  • UW Latency

  • Complete Hit Rate Graph

