load balancing in web clusters cs 213 lecture 15 from: ibm technical report
Post on 20-Dec-2015
![Page 1: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/1.jpg)
Load Balancing in Web Clusters
CS 213
LECTURE 15
From: IBM Technical Report
![Page 2: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/2.jpg)
References
1. V. Cardellini, E. Casalicchio, M. Colajanni and P. S. Yu, “The State of the Art in Locally Distributed Web-server Systems”.
2. L. Zhao, Y. Luo, L. Bhuyan and R. Iyer, “A Network Processor Based, Content Aware Switch”, IEEE Micro, Special Issue on High-Performance Interconnects, May/June 2006, pp. 72-84.
![Page 3: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/3.jpg)
Concepts
• Web server system: provides web services
Trend:
1. Increasing number of clients
2. Growing complexity of web applications
• Scalable web server systems: the ability to support large numbers of accesses and resources while still providing adequate performance
![Page 4: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/4.jpg)
Architecture Solutions
![Page 5: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/5.jpg)
Single Node Solution
• Hardware scale-up: expanding a system by adding more resources
• Software scale-up: using a specialized operating system and web server software
![Page 6: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/6.jpg)
Multiple Nodes Solution
• Local scale-out (locally distributed Web systems): nodes are deployed at a single network location
• Global scale-out: nodes are located at different geographical locations
![Page 7: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/7.jpg)
Model Architecture
![Page 8: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/8.jpg)
Locally Distributed Web System
• Cluster-based Web system: the server nodes hide their IP addresses from clients behind a single virtual IP (VIP) address that belongs to one device, the web switch, placed in front of the set of servers. The web switch receives all packets and then sends them to the server nodes
• Distributed Web system: the IP addresses of the web server nodes are visible to clients. There is no web switch; at most a layer-3 router is employed to route the requests
![Page 9: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/9.jpg)
Cluster based Architecture
![Page 10: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/10.jpg)
Distributed Architecture
![Page 11: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/11.jpg)
Two Approaches
The two approaches differ in the OSI protocol layer at which the web switch routes inbound packets
• Layer-4 switch: determines the target server when the TCP SYN packet is received. Also called content-blind routing, because the server selection policy is not based on HTTP content at the application level
• Layer-7 switch: first establishes a complete TCP connection with the client, examines the HTTP request at the application level, and then selects a server. It can support sophisticated dispatching policies, but moving up to the application level adds considerable latency. Also called a content-aware switch, or a layer-5 switch in terms of the TCP/IP protocol stack
![Page 12: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/12.jpg)
![Page 13: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/13.jpg)
Cluster based architecture Taxonomy
![Page 14: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/14.jpg)
Layer-4 two-way architecture
![Page 15: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/15.jpg)
Two-Way Routing
• Both inbound and outbound packets of the cluster pass through the web switch
• Each server in the cluster has a unique private IP address, visible to the web switch but not to clients
• The web switch rewrites inbound packets by changing the VIP address to the target server’s IP address
• For outbound packets, the web switch rewrites the source address to its VIP
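The rewriting steps above can be sketched as follows. This is a minimal illustration, assuming packets are modeled as plain dictionaries; the addresses, the `conn_table` name, and the function names are hypothetical and do not come from the lecture.

```python
# Illustrative two-way (NAT-style) rewriting at the web switch.
# All addresses and names are made up for the example.
VIP = "203.0.113.10"        # virtual IP address seen by clients

conn_table = {}             # (client_ip, client_port) -> server private IP


def rewrite_inbound(packet, pick_server):
    """Replace the VIP destination with the chosen server's private IP."""
    key = (packet["src_ip"], packet["src_port"])
    if key not in conn_table:          # first packet of a connection (TCP SYN)
        conn_table[key] = pick_server()
    return {**packet, "dst_ip": conn_table[key]}


def rewrite_outbound(packet):
    """Replace the server's private source address with the VIP."""
    return {**packet, "src_ip": VIP}
```

A real switch would also have to recompute the IP and TCP checksums after each rewrite, which this sketch leaves out.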
![Page 16: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/16.jpg)
Layer-4 one-way architecture
![Page 17: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/17.jpg)
Layer-4 one-way mechanisms
Requests pass through the web switch, but replies from the servers are sent directly to the clients over a separate path. Routing to the target server is done in one of the following ways.
• Packet single-rewriting: the web switch replaces its VIP address with the selected server’s IP address in each inbound packet
• Packet tunneling (packet encapsulation): IP datagrams are encapsulated within IP datagrams (read from paper)
• Packet forwarding: the web switch rewrites the layer-2 destination address to the MAC address of the server (see paper)
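The three mechanisms differ only in which part of the inbound packet the switch touches. The toy model below shows that difference, assuming packets and frames are plain dictionaries; the field names and addresses are hypothetical.

```python
# Three ways a one-way layer-4 switch can steer an inbound packet (toy model).
VIP = "203.0.113.10"


def single_rewrite(packet, server_ip):
    """Packet single-rewriting: swap the VIP for the server's IP address."""
    return {**packet, "dst_ip": server_ip}


def tunnel(packet, switch_ip, server_ip):
    """Packet tunneling: wrap the original datagram in an outer IP datagram."""
    return {"src_ip": switch_ip, "dst_ip": server_ip, "payload": packet}


def forward(frame, server_mac):
    """Packet forwarding: rewrite only the layer-2 destination (MAC) address."""
    return {**frame, "dst_mac": server_mac}
```

With tunneling and forwarding the inner IP packet still carries the VIP as its destination, so the original datagram reaches the server unmodified.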
![Page 18: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/18.jpg)
Web Switch or Layer 5/7 Switch or Content Aware Switch
• Layer 4 switch
– Content blind
– Storage overhead
– Difficult to administer
• Content-aware (Layer 5/7) switch
– Partition the server’s database over different nodes
– Increase the performance due to improved hit rate
– Servers can be specialized for certain types of requests
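[Figure: a client request for www.yahoo.com (GET /cgi-bin/form HTTP/1.1, Host: www.yahoo.com…) arrives from the Internet at the switch, which sits in front of an image server, an application server, and an HTML server. Packet layout shown: IP | TCP | APP. DATA]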
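A content-aware switch can pick a specialized server by looking at the HTTP request line. The sketch below is one possible rule set, assuming three pools named after the servers on this slide; the suffix list and the `/cgi-bin/` rule are illustrative assumptions.

```python
# Content-aware dispatch: choose a server pool from the HTTP request line.
# Pool names and matching rules are hypothetical.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif")


def select_pool(request_line):
    """Return the pool name for a request line such as 'GET /a.gif HTTP/1.1'."""
    _method, path, _version = request_line.split(" ", 2)
    path = path.split("?", 1)[0].lower()   # ignore any query string
    if path.startswith("/cgi-bin/"):
        return "application"
    if path.endswith(IMAGE_SUFFIXES):
        return "image"
    return "html"
```

For the request shown on the slide, `select_pool("GET /cgi-bin/form HTTP/1.1")` selects the application pool.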
![Page 19: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/19.jpg)
Layer-7 two-way architecture
![Page 20: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/20.jpg)
Layer-7 two-way mechanisms
• TCP gateway: an application-level proxy running on the web switch mediates the communication between the client and the server. It makes separate TCP connections to the client and to the server
• TCP splicing: reduces the overhead of the TCP gateway. For outbound packets, packet forwarding occurs at the network level by rewriting the client IP address
![Page 21: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/21.jpg)
Layer-7 Two-way Mechanisms
• TCP gateway: an application-level proxy on the web switch mediates the communication between the client and the server
• TCP splicing: reduces the overhead of the TCP gateway by forwarding packets directly in the OS kernel
[Figure: data paths through user space and kernel space for the TCP gateway and for TCP splicing]
![Page 22: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/22.jpg)
TCP Splicing
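• Establish a connection with the client
– Three-way handshake
• Choose the server
• Establish a connection with the server
• Splice the two connections
• Map the sequence numbers for subsequent packets
[Figure: time-sequence diagram between Client, Switch, and Server. C, D, and S denote the initial sequence numbers of the client, the switch, and the server.]
1. Client -> Switch: SYN(C)
2. Switch -> Client: SYN(D), ACK(C+1)
3. Client -> Switch: ACK(D+1), Data(C+1)
4. Switch -> Server: SYN(C)
5. Server -> Switch: SYN(S), ACK(C+1)
6. Switch -> Server: ACK(S+1), Data(C+1)
7. Server -> Switch: ACK(C+len+1), Data(S+1)
8. Switch -> Client: ACK(C+len+1), Data(D+1) (sequence number mapped S -> D)
9. Client -> Switch: ACK(D+len+1)
10. Switch -> Server: ACK(S+len+1) (acknowledgment mapped D -> S)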
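The sequence mapping used after the splice can be written as two small functions. This is a sketch under the assumption that C, D, and S are the initial sequence numbers of client, switch, and server, and that the switch reuses C when it opens the server-side connection, so only the D and S number spaces need translating. Function names are hypothetical.

```python
# Sequence-number mapping for a spliced connection (toy model).
MOD = 2 ** 32   # TCP sequence numbers wrap around at 2^32


def server_to_client(seq, ack, D, S):
    """Server -> client: shift the sequence number from S space to D space."""
    return (seq - S + D) % MOD, ack


def client_to_server(seq, ack, D, S):
    """Client -> server: shift the acknowledgment from D space to S space."""
    return seq, (ack - D + S) % MOD
```

For example, with made-up values D = 1000 and S = 5000, a server segment with sequence number S+1 is forwarded to the client with sequence number D+1, and a client acknowledgment of D+len+1 reaches the server as S+len+1.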
![Page 23: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/23.jpg)
Latency on a Linux-based switch
• Latency is reduced by TCP splicing
![Page 24: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/24.jpg)
Layer-7 one-way architecture
![Page 25: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/25.jpg)
Layer-7 one-way mechanisms
• TCP handoff: the handoff protocol is layered on top of TCP. The switch hands off the TCP connection endpoint to the server
• TCP connection hop: the IP packet is encapsulated in an RPX packet and sent to the server
Details of TCP handoff will be covered in the next class
![Page 26: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/26.jpg)
Summary
![Page 27: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/27.jpg)
Layer-4 Products
![Page 28: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/28.jpg)
Layer 7 products
![Page 29: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/29.jpg)
Design Options
• Option (a): Linux-based switch
– Overhead of moving data across the PCI bus
– Interrupts or polling still needed
• Option (b): Put a control processor (CP) in the interface to set up connections and execute complicated applications. Data processors (DPs) process packets for forwarding, classification, and simple processing
– But the CP may have its own protocol stack, e.g. embedded Linux
• Option (c): DPs handle connection setup, splicing, and forwarding
– But large code size is a serious problem because of the limited instruction memory of the DPs
![Page 30: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/30.jpg)
Latency
[Figure: latency on the switch (ms, axis from 0 to 20) versus request file size (KB; 1, 4, 16, 64, 256, 1024) for Linux Splicer and SpliceNP]
![Page 31: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/31.jpg)
Throughput
[Figure: throughput (Mbps, axis from 0 to 800) versus request file size (KB; 1, 4, 16, 64, 256, 1024) for Linux Splicer and SpliceNP]
![Page 32: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/32.jpg)
Dispatching Algorithms
Strategies to select the target server of the web cluster
• Static: the fastest solution, and so the least likely to turn the web switch into a bottleneck, but it does not consider the current state of the servers
• Dynamic: can outperform static algorithms by making better-informed decisions, but collecting and analyzing state information causes expensive overhead
Requirements: (1) low computational complexity, (2) full compatibility with web standards, (3) state information must be readily available without much overhead
![Page 33: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/33.jpg)
![Page 34: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/34.jpg)
Content blind approach
• Static policies:
– Random: distributes the incoming requests uniformly, with equal probability of reaching any server
– Round Robin (RR): uses a circular list and a pointer to the last selected server to make the decision
– Static Weighted RR (for heterogeneous servers): a variation of RR in which each server is assigned a weight Wi depending on its capacity
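The three static policies are short enough to show side by side. This is a minimal sketch; the server names and weights are hypothetical, and the weighted variant simply repeats each server Wi times per cycle rather than interleaving them.

```python
import itertools
import random


def random_policy(servers, rng=random):
    """Random: every server is equally likely to be chosen."""
    return rng.choice(servers)


def round_robin(servers):
    """Round Robin: cycle through the servers in a fixed circular order."""
    return itertools.cycle(servers)


def static_weighted_rr(weights):
    """Static Weighted RR: server i appears W_i times per cycle.

    `weights` maps server name -> integer weight W_i.
    """
    schedule = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(schedule)
```

Calling `next()` on the iterator returned by `round_robin` or `static_weighted_rr` yields the server for the next request. None of these policies looks at the current server state.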
![Page 35: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/35.jpg)
Content blind approach (Cont.)
• Dynamic
– Client state aware: statically partition the server nodes and assign each group of clients, identified through client information such as the source IP address, to one partition
– Server state aware
. Least Loaded: choose the server with the lowest load. Issue: which server load index should be used?
. Least Connection: choose the server with the fewest active connections first
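Both ideas on this slide fit in a few lines. The sketch below assumes a CRC-32 hash of the source IP address for the client partitioning, which is an illustrative choice rather than something the lecture prescribes; the function names are hypothetical.

```python
import zlib


def client_partition(src_ip, servers):
    """Client state aware: map a source IP address to a fixed server."""
    return servers[zlib.crc32(src_ip.encode()) % len(servers)]


def least_connections(active):
    """Server state aware: pick the server with the fewest active connections.

    `active` maps server name -> current connection count.
    """
    return min(active, key=active.get)
```

Least Connection uses the number of open connections as the load index, which the switch already tracks and so costs little to obtain.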
![Page 36: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/36.jpg)
Content blind approach (Cont.)
• Server state aware (contd.)
– Fastest Response: choose the server that is responding fastest
– Weighted Round Robin: a variation of static RR that associates each server with a dynamically evaluated weight that is proportional to the server load
• Client and server state aware
– Client affinity: instead of assigning each new connection to a server only on the basis of the server state, regardless of any past assignment, consecutive connections from the same client can be assigned to the same server
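Client affinity can be layered on top of a server-state-aware policy. The sketch below assumes least-connections as the underlying policy and identifies clients by IP address; the class and attribute names are hypothetical.

```python
class AffinityDispatcher:
    """Least-connections dispatch with client affinity (illustrative only)."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}   # server -> open connections
        self.last_server = {}                   # client IP -> server used before

    def assign(self, client_ip):
        server = self.last_server.get(client_ip)
        if server is None:                      # new client: use server state
            server = min(self.active, key=self.active.get)
            self.last_server[client_ip] = server
        self.active[server] += 1
        return server

    def close(self, server):
        """Record that one connection to `server` has finished."""
        self.active[server] -= 1
```

A returning client goes back to its previous server even if another server is currently less loaded, which is the trade-off the slide describes.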
![Page 37: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/37.jpg)
Considerations of content blind
• The static approach is the fastest and is easy to implement, but it may make poor assignment decisions
• The dynamic approach has the potential to make better decisions, but it needs to collect and analyze state information, which may cause high overhead
• Overall, a simple server-state-aware algorithm is the best choice; the least-loaded algorithm is commonly used in commercial products
![Page 38: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/38.jpg)
![Page 39: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/39.jpg)
Content aware approach
• Server state aware
– Cache Affinity: the file space is partitioned among the server nodes
– Load Sharing
. SITA-E (Size Interval Task Assignment with Equal Load): the switch determines the size of the requested file and selects the target server based on this information
. CAP (Client-Aware Policy): web requests are classified based on their impact on system resources, such as I/O bound or CPU bound
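Size-interval assignment amounts to a lookup of the file size in a sorted list of cutoffs. In the sketch below the cutoffs and server names are hypothetical; in SITA-E they would be chosen so that each size interval carries roughly the same total load.

```python
import bisect

# Hypothetical size cutoffs in bytes: one interval per server.
CUTOFFS = [10_000, 1_000_000]
SERVERS = ["small-files", "medium-files", "large-files"]


def sita_select(file_size):
    """Return the server whose size interval contains file_size."""
    return SERVERS[bisect.bisect_right(CUTOFFS, file_size)]
```

Because the file size is only known after the request has been parsed, this policy needs a content-aware switch.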
![Page 40: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/40.jpg)
Content aware approach (Cont.)
• Client state aware
– Service Partitioning: employ specialized servers for certain types of requests
– Client Affinity: use a session identifier to assign all web transactions from the same client to the same server
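Session-based client affinity might look like the sketch below. It assumes the session identifier travels in a cookie named `SESSIONID`, which is a made-up name for illustration; `fallback` stands for any other dispatching policy used for requests without a known session.

```python
def session_id(headers, cookie_name="SESSIONID"):
    """Extract a session identifier from the Cookie header, if present."""
    for part in headers.get("Cookie", "").split(";"):
        name, _, value = part.strip().partition("=")
        if name == cookie_name and value:
            return value
    return None


def assign(headers, sessions, fallback):
    """Send requests that carry a known session ID to the same server."""
    sid = session_id(headers)
    if sid is None:
        return fallback()
    if sid not in sessions:
        sessions[sid] = fallback()      # first request of this session
    return sessions[sid]
```

Here `sessions` is a dictionary from session ID to server that the switch keeps across requests.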
![Page 41: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/41.jpg)
Content aware approach (Cont.)
• Client and server state aware
– LARD (Locality-Aware Request Distribution): direct all requests for the same web object to the same server node as long as its utilization is below a given threshold
– Cache Manager: a cache manager that is aware of the cache content of all web servers
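The LARD rule as stated on this slide can be sketched with a single threshold. This is a simplification for illustration: the load measure, the threshold value, and the choice of the least-loaded node as the fallback are all assumptions.

```python
def lard_select(url, assignment, load, threshold):
    """Pick a server for `url`.

    assignment: url -> server currently serving that object
    load:       server -> current load (e.g. active connections)
    threshold:  load at or above which a server counts as overloaded
    """
    server = assignment.get(url)
    if server is None or load[server] >= threshold:
        server = min(load, key=load.get)   # fall back to the least loaded node
        assignment[url] = server
    return server
```

Requests for the same object keep hitting the same server's cache until that server becomes overloaded, at which point the object is reassigned.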
![Page 42: Load Balancing in Web Clusters CS 213 LECTURE 15 From: IBM Technical Report](https://reader037.vdocuments.mx/reader037/viewer/2022110207/56649d445503460f94a203f3/html5/thumbnails/42.jpg)