Locality-Aware Request Distribution in Cluster-based Network Servers
![Page 1: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/1.jpg)
Locality-Aware Request Distribution in Cluster-based Network Servers
Presented by: Kevin Boos
Authors: Vivek S. Pai, Mohit Aron, et al.
Rice University
ASPLOS 1998
*** Figures adapted from original presentation ***
![Page 2: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/2.jpg)
Time Warp to 1998
- Rapid Internet growth
- Bandwidth limitations
- “Cheap” PCs and “fast” LANs
- Need for increased throughput
![Page 3: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/3.jpg)
Clustered Servers
[Figure: clients, a front-end node, a LAN (switch), and three back-end nodes]
![Page 4: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/4.jpg)
Weighted Round Robin (WRR)
![Page 5: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/5.jpg)
Pure Locality-Based Distribution
![Page 6: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/6.jpg)
Motivation for Change
- Weighted Round Robin
  - Disregards content on back-end nodes
  - Many cache misses
  - Limited by disk performance
- Pure Locality-Based Distribution
  - Disregards current load on back-end nodes
  - Uneven load distribution
  - Inefficient use of resources
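
The two prior strategies can be sketched in a few lines of Python. This is an illustration only, not code from the paper: `make_round_robin` drops the weights (every back-end counts equally), `make_locality_based` assumes the target is hashed by name, and both function names are invented for this example.

```python
import hashlib
from itertools import cycle


def make_round_robin(nodes):
    """Return a dispatcher that ignores the target and rotates through nodes."""
    rotation = cycle(nodes)

    def dispatch(target):
        # The target plays no role, so the same content ends up on every node.
        return next(rotation)

    return dispatch


def make_locality_based(nodes):
    """Return a dispatcher that hashes the target name to one fixed node."""

    def dispatch(target):
        digest = hashlib.sha256(target.encode("utf-8")).digest()
        # Load plays no role, so a popular target can swamp its node.
        return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

    return dispatch
```

Round robin spreads requests for one target over all nodes, which hurts cache hit rates; hashing always picks the same node for a target, however busy that node is.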
![Page 7: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/7.jpg)
LARD Concepts
- Locality-Aware Request Distribution
- Goal: improve performance
  - Higher throughput
  - Higher cache hit rates
  - Reduced disk access
- Even load distribution + content-based distribution
  - The best of both algorithms
![Page 8: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/8.jpg)
Outline
- Basic LARD Algorithm
- Improvements to LARD
- TCP Handoff Protocol
- Simulation and Results
- Prototype Implementation and Testing
![Page 10: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/10.jpg)
Basic LARD Algorithm
- Front-end maps target content to back-end nodes
  - Each target maps to exactly one back-end node
- First request for each target is assigned to the least-loaded back-end node
- Subsequent requests for that target go to the same back-end node, based on the target content mapping
  - Unless that node is overloaded, in which case the front-end re-assigns the target to a new back-end node
![Page 11: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/11.jpg)
Flow of Basic LARD
[Figure: client requests for target A routed by the front-end]
![Page 12: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/12.jpg)
Determining Load in Basic LARD
- Ask the server?
  - Introduces unnecessary communication
- Current load = number of open connections
  - Tracked in the front-end node
- Use thresholds to determine when to re-balance
  - Low, High, and Limit
  - Re-balance when (load > T_limit) or (load > T_high and there is a “free” node with load < T_low)
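
A minimal sketch of the basic LARD dispatch loop, combining the target-to-node mapping with the threshold test above. The class and method names are hypothetical, the threshold values are left to the caller, and the front-end is assumed to learn when a connection closes (modelled here by `close`).

```python
class BasicLard:
    """Front-end state for basic LARD: one back-end node per target."""

    def __init__(self, nodes, t_low, t_high, t_limit):
        self.load = {node: 0 for node in nodes}  # open connections per node
        self.server = {}                         # target -> assigned node
        self.t_low, self.t_high, self.t_limit = t_low, t_high, t_limit

    def _least_loaded(self):
        return min(self.load, key=self.load.get)

    def _should_rebalance(self, node):
        load = self.load[node]
        free_node = any(other < self.t_low for other in self.load.values())
        # Re-balance when load > T_limit, or load > T_high with a free node.
        return load > self.t_limit or (load > self.t_high and free_node)

    def dispatch(self, target):
        """Pick the back-end for a request and count the new connection."""
        node = self.server.get(target)
        if node is None or self._should_rebalance(node):
            node = self._least_loaded()
            self.server[target] = node
        self.load[node] += 1
        return node

    def close(self, node):
        """Record that a connection to the given back-end has finished."""
        self.load[node] -= 1
```

Because load is just a per-node counter of open connections, the front-end never has to query the back-ends.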
![Page 14: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/14.jpg)
LARD Needs Improvement
- Only one back-end node per target content
  - Working set is a single node
  - Front-end must limit total connections
- Still need to increase throughput
  - One node per content type is unrealistic
  - …add more back-end nodes?
![Page 15: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/15.jpg)
LARD/R
- LARD with Replication
- Maps target content to a set of back-end nodes
  - Working set is several nodes with similar cache content
- Sends new requests to least-loaded node in set
- Moves nodes to/from sets based on load imbalance
  - Idle nodes in a low-load set are moved to higher-load set
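
A rough sketch of how LARD/R might extend the basic scheme with per-target server sets. The names are hypothetical, and the overload test assumes the same Low/High/Limit thresholds as basic LARD. The slides do not say exactly when a set shrinks, so the sketch assumes that a set left unchanged for `k_seconds` drops its most-loaded member; that rule and its default value are assumptions of this example.

```python
import time


class LardR:
    """LARD with replication: each target maps to a set of back-end nodes."""

    def __init__(self, nodes, t_low, t_high, t_limit,
                 k_seconds=20.0, clock=time.monotonic):
        self.load = {node: 0 for node in nodes}  # open connections per node
        self.server_set = {}                     # target -> list of nodes
        self.last_mod = {}                       # target -> time of last change
        self.t_low, self.t_high, self.t_limit = t_low, t_high, t_limit
        self.k_seconds = k_seconds               # assumed shrink interval
        self.clock = clock                       # injectable for testing

    def _overloaded(self, node):
        load = self.load[node]
        free_node = any(other < self.t_low for other in self.load.values())
        return load > self.t_limit or (load > self.t_high and free_node)

    def dispatch(self, target):
        now = self.clock()
        members = self.server_set.setdefault(target, [])
        if not members:
            # First request for this target: start with the least-loaded node.
            members.append(min(self.load, key=self.load.get))
            self.last_mod[target] = now
        node = min(members, key=self.load.get)
        if self._overloaded(node):
            # Grow the set with the least-loaded node in the whole cluster.
            node = min(self.load, key=self.load.get)
            if node not in members:
                members.append(node)
                self.last_mod[target] = now
        elif len(members) > 1 and now - self.last_mod[target] > self.k_seconds:
            # Assumed shrink rule: drop the most-loaded member of a stable set.
            members.remove(max(members, key=self.load.get))
            self.last_mod[target] = now
            node = min(members, key=self.load.get)
        self.load[node] += 1
        return node

    def close(self, node):
        self.load[node] -= 1
```

A hot target gains replicas while its nodes are overloaded and gives them back once demand settles.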
![Page 16: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/16.jpg)
Flow of LARD/R
[Figure: client requests for target A routed by the front-end to a set of back-end nodes]
![Page 18: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/18.jpg)
Determining Content Type
- How do we determine content in the front-end?
  - Front-end must see network traffic
- Standard TCP Assumptions
  - Requests are small and light
  - Responses are big and heavy
- How do we forward requests?
![Page 19: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/19.jpg)
Potential TCP Solutions
- Simple TCP Proxy
  - Everything must flow through front-end node
  - Can inspect all incoming content
  - Cannot respond directly from back-end to client
  - But front-end can also inspect all outgoing content
  - Better for persistent connections
![Page 20: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/20.jpg)
TCP Connection Handoff
- Front-end connects to client
- Inspects content
- Forwards request to back-end node
- Response is returned directly to the client from the back-end node
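
A back-of-the-envelope comparison of the two approaches, counting only the payload bytes that must pass through the front-end. This is a simplified model, not a protocol implementation: it ignores ACKs, headers, and connection setup, and the function name and mode strings are made up for the example.

```python
def front_end_bytes(request_bytes, response_bytes, mode):
    """Payload bytes the front-end must handle for one request."""
    if mode == "proxy":
        # A proxy relays both the request and the response.
        return request_bytes + response_bytes
    if mode == "handoff":
        # With handoff the back-end answers the client directly.
        return request_bytes
    raise ValueError(f"unknown mode: {mode!r}")
```

With small requests and large responses, the handoff case leaves the front-end handling only a small fraction of the bytes a proxy would relay.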
![Page 22: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/22.jpg)
Evaluation Goals
- Throughput
  - Requests/second served by entire cluster
- Hit rate
  - (Requests that hit memory cache) / (total requests)
- Underutilization time
  - Time that a node’s load is ≤ 40% of T_low
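
The three metrics can be written down directly. The sketch below assumes a node's load is sampled at regular intervals, so underutilization time is reported as a fraction of samples; the function names and the sampling scheme are assumptions of this example, not details from the slides.

```python
def throughput(total_requests, elapsed_seconds):
    """Requests per second served by the entire cluster."""
    return total_requests / elapsed_seconds


def hit_rate(cache_hits, total_requests):
    """Fraction of requests served from the memory cache."""
    return cache_hits / total_requests if total_requests else 0.0


def underutilization_fraction(load_samples, t_low, fraction=0.4):
    """Fraction of samples in which a node's load is <= fraction * t_low."""
    if not load_samples:
        return 0.0
    threshold = fraction * t_low
    idle = sum(1 for load in load_samples if load <= threshold)
    return idle / len(load_samples)
```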
![Page 23: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/23.jpg)
Simulation Model
- 300 MHz Pentium II
- 32 MB Memory (cache)
- 100 Mbps Ethernet
- Traces from web servers at Rice and IBM
![Page 24: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/24.jpg)
Simulation Results – Prior Work
- Weighted Round Robin
  - Lowest throughput
  - Highest cache miss ratio
  - But lowest idle time
- Pure Locality-Based
  - Adding nodes decreases the cache miss ratio
  - But idle time increases (unbalanced load)
  - Only minor improvement over WRR
![Page 25: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/25.jpg)
Simulation Results – LARD & LARD/R
- Throughput ~4x better (8 nodes)
  - WRR would need nodes with a 10x larger cache size
- CPU bound after 8 nodes
- Cache miss rate decreases
- Only 1% idle time on average
![Page 26: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/26.jpg)
Simulation Results – Throughput
![Page 27: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/27.jpg)
Simulation Results – Cache Misses
![Page 28: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/28.jpg)
Simulation Results – Idle Time
![Page 29: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/29.jpg)
What Affects Performance?
- WRR is disk-bound, LARD/R is CPU-bound
  - Increasing CPU speed improves LARD/R, not WRR
  - Adding more disks improves WRR, not LARD/R
  - LARD/R shows no improvement if a node has > 2 disks
- WRR is not scalable
![Page 31: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/31.jpg)
Prototype Implementation
- One front-end PC
  - 300 MHz Pentium II, 128 MB RAM
- 6 back-end PCs
- 7 client PCs
  - 166 MHz Pentium Pro, 64 MB RAM
- 100 Mb Ethernet, 24-port switch
![Page 32: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/32.jpg)
Prototype Testing Results
![Page 33: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/33.jpg)
Evaluation Shortcomings
- What influences the results more?
  - LARD/R protocol?
  - TCP handoff protocol?
![Page 34: Locality-Aware Request Distribution in Cluster-based Network Servers](https://reader036.vdocuments.mx/reader036/viewer/2022062310/56816597550346895dd86fe9/html5/thumbnails/34.jpg)
Conclusion
- LARD and LARD/R perform significantly better than WRR
  - Higher throughput
  - Better CPU utilization
  - More frequent cache hits
  - Reduced disk access
- Combines the benefits of locality-based and load-balanced distribution
- Scalable at low cost