7/9/2001 Edward Chow Content Switch 1 Introduction to Linux-based Virtual Server and Content Switch C. Edward Chow Department of Computer Science University of Colorado at Colorado Springs [email protected] The ppt file of this tutorial is available at http://cs.uccs.edu/~chow/pub/conf/pdcat/tutorial.ppt Part of this work sponsored by CCL/ITRI


7/9/2001 Edward Chow Content Switch 1

Introduction to Linux-based Virtual Server and Content Switch

C. Edward Chow

Department of Computer Science

University of Colorado at Colorado Springs

[email protected]

The ppt file of this tutorial is available at http://cs.uccs.edu/~chow/pub/conf/pdcat/tutorial.ppt

Part of this work sponsored by CCL/ITRI


7/9/2001 Edward Chow Content Switch 2

Outline of the Talk

• Overview of Content Delivery Networks
• Linux-based Virtual Server
• Linux-based Content Switching


7/9/2001 Edward Chow Content Switch 3

Content Delivery Network (CDN)

[Diagram: groups of clients reach a single host server across ISP networks (MindSpring, PSINet, Sprint, Globix, Qwest, @Home, UUNET). Huge request volume leads to slow response and server crash.]


7/9/2001 Edward Chow Content Switch 4

Content Delivery Problems

http://www.akamai.com


7/9/2001 Edward Chow Content Switch 5

Use Client Cache/Client Side Cache Server

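[Diagram: clients reach the host server across ISP networks (MindSpring, PSINet, Sprint, Globix, Qwest, @Home, UUNET). A client cache and a client-side cache server answer requests locally, giving fast response and sending fewer requests to the host server.]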


7/9/2001 Edward Chow Content Switch 6

Use Mirror Sites

[Diagram: clients reach the host server across ISP networks (MindSpring, PSINet, Sprint, Globix, Qwest, @Home, UUNET). Two mirror sites serve nearby clients, giving fast response and sending fewer requests to the host server.]

Needs improvement: the selection of mirror servers should be guided by server load and network bandwidth measurements.


7/9/2001 Edward Chow Content Switch 7

Edge Network Cache Servers

[Diagram: clients reach the host server across ISP networks (MindSpring, PSINet, Sprint, Globix, Qwest, @Home, UUNET). In addition to the client cache, the client-side cache server, and two mirror sites, edge network cache servers are placed inside the ISP networks, giving fast response and sending fewer requests to the host server.]


7/9/2001 Edward Chow Content Switch 8

Content Delivery Problem

• Cache Location Problem: Where to put cache servers?
• How many are needed?
• When/where/how to push/deliver the content?
• How about dynamic content?


7/9/2001 Edward Chow Content Switch 9

Akamai Edge Delivery Service

• Peering Bottleneck Problem: Access traffic is evenly spread over 7400+ networks (no one over 5%; most << 1%), so edge servers need to be placed in many networks.
• 11/2000: 4 billion hits/day for 2800 sites.
• Source: http://www.akamai.com

Date    | # of Edge Servers | # of Networks | # of Countries
11/2000 | 6000              | 335           | 54
6/2001  | 9700              | 650           | 56


7/9/2001 Edward Chow Content Switch 10

Caching Dynamic Content at Web Proxies

• Active Cache Project: [Pei Cao 98] Univ. Wisconsin
  – Cache Java applet to be executed at proxies
  – Choice of passing the request to the server, delivering the cached copy, or generating the content dynamically.
• Edge Side Include (ESI):
  – XML tag to specify an ESI fragment in a web page.
  – Each ESI fragment can have different cache/


7/9/2001 Edward Chow Content Switch 11

Edge Side Include Example

http://www.esi.org/

<table><tr><td colspan="2">
<esi:try>
  <esi:attempt>
    <esi:include src="http://www.myxyz.com/news/top.html" onerror="continue" />
  </esi:attempt>
  <esi:except>
    <!--esi This spot is reserved for your company's advertising. For more info <a href="www.myxyz.com"> click here </a> -->
  </esi:except>
</esi:try>
</td></tr></table>


7/9/2001 Edward Chow Content Switch 12

Solution to First Mile Problem

• First Mile Problem: Huge requests at web site of CDN
• High Bandwidth Connection
• Caching
  – End System Cache
    • Client Cache
    • Client Site Proxy Cache Server
    • Mirror Site Caches
  – Cache Servers in Internet
    • Hierarchical Cache Servers, e.g., Squid/Harvest/Adaptive Web
    • Edge Servers of Akamai
• Faster Server/Server Farm (Server Side Caching + Cluster)
• Layer 4 Load Balancer + Real Servers
• Content Switch + Real Servers
• Distributed Packet Rewrite


7/9/2001 Edward Chow Content Switch 13

Web Server Cluster

[Diagram: a load balancer or content switch distributes requests to four real servers.]

Load balancer can run at

• Application level: reverse proxy
• Kernel level: Linux Virtual Server

Load balancer can distribute requests based on

• Layer 3-4 info: fixed fields, fast hash
• Layer 3-7 info: variable length, slow parsing
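The difference between the two kinds of request distribution can be illustrated with a minimal Python sketch. It is not taken from LVS or any product: the server names, the CRC32 hash, and the URL patterns are assumptions chosen only for illustration. A layer 3-4 decision hashes fixed header fields, while a layer 3-7 decision must first parse the variable-length request.

```python
import zlib

# Hypothetical real-server names, for illustration only.
REAL_SERVERS = ["rs1", "rs2", "rs3", "rs4"]


def pick_layer4(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Layer 3-4 decision: hash fixed header fields; the payload is never read."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return REAL_SERVERS[zlib.crc32(key) % len(REAL_SERVERS)]


def pick_layer7(http_request: str) -> str:
    """Layer 3-7 decision: parse the variable-length request line, then match."""
    request_line = http_request.split("\r\n", 1)[0]
    parts = request_line.split(" ")
    path = parts[1] if len(parts) >= 2 else "/"
    # Hypothetical content rules: images to rs1, CGI scripts to rs2.
    if path.startswith("/images/"):
        return "rs1"
    if path.endswith(".cgi"):
        return "rs2"
    return "rs3"
```

The layer 4 function does constant work per packet, whereas the layer 7 function's cost grows with the size of the request it has to parse.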


7/9/2001 Edward Chow Content Switch 14

Comparison of Load Balancers

• Reverse Proxy runs as an application process and requires more memory/packet copying.
• Linux Virtual Server runs in the kernel and avoids that extra memory/packet copying.

Name                                | Type | Level       | Layer Info
Reverse Proxy/Apache/Tomcat/Servlet | SW   | Application | 3-7
Linux Virtual Server                | SW   | Kernel      | 3-4
Linux Content Switch                | SW   | Kernel      | 3-7
Layer4 Switch (narrow def.)         | HW   | Embedded OS | 3-4
Content/Web Switch                  | HW   | Embedded OS | 3-7


7/9/2001 Edward Chow Content Switch 15

Linux Virtual Server (LVS)

• "Virtual server is a highly scalable and highly available server built on a cluster of real servers. The architecture of the cluster is transparent to end users, and the users see only a single virtual server" with a Virtual IP address (VIP).
• http://www.linuxvirtualserver.org/

[Diagram: client (CIP), Internet, load balancer/director Linux box (VIP), WAN/LAN, Real Server1 (RIP1), Real Server2 (RIP2), Real Server3 (RIP3).]

CIP: Client IP Address
VIP: Virtual IP Address
RIP: Real Server IP Address


7/9/2001 Edward Chow Content Switch 16

LVS-NAT Configuration (Network Address Translation)

• All return traffic goes through the Director, so it is slow
• Modify IP addr/port #/checksum at Director
• Director and real servers on the same LAN
• No modification needed on real servers
• Port remapping: real web server can run on 8080

[Diagram: client (CIP), Internet, Director (VIP), switch, Real Server1 (RIP1), Real Server2 (RIP2), Real Server3 (RIP3).]


7/9/2001 Edward Chow Content Switch 17

LVS-NAT Configuration Step 2. Director routes Pkt

• Based on CIP, source port#, VIP and dst port#, director selects one of the real servers

• Change the dst IP addr or port # of pkt.

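[Diagram: client (CIP), Internet, Director (VIP), switch, Real Servers 1-3 (RIP1-RIP3). 1. Request: the packet arrives with (source, destination) = (CIP, VIP). 2. Scheduling/rewrite packet: the Director applies the LVS routing and scheduling rules, set with the ipvsadm command, and rewrites the packet to (CIP, RIP1).]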


7/9/2001 Edward Chow Content Switch 18

LVS-NAT Configuration Step 3. Real Server Replies

• Real server retrieves response.
• All real servers set default gateway to Director, like any other NAT or IP masquerade setup
• Packet will be sent back to Director.

[Diagram: client (CIP), Internet, Director (VIP), switch, Real Servers 1-3 (RIP1-RIP3). 1. Request (CIP, VIP). 2. Scheduling/rewrite packet to (CIP, RIP1). 3. Process request: the real server replies with (RIP1, CIP).]


7/9/2001 Edward Chow Content Switch 19

LVS-NAT Configuration Step 4. Director rewrites reply

• Director changes the source IP addr. (RIP1) of the pkt to VIP
• Modify port # if needed.
• Modify the checksum; send back pkt.

[Diagram: client (CIP), Internet, Director (VIP), switch, Real Servers 1-3 (RIP1-RIP3). 1. Request (CIP, VIP). 2. Scheduling/rewrite packet to (CIP, RIP1). 3. Process request: reply (RIP1, CIP). 4. Rewrite reply to (VIP, CIP).]


7/9/2001 Edward Chow Content Switch 20

LVS-NAT Configuration (Network Address Translation)

• All return traffic goes through the Director, so it is slow
• Modify IP addr/port #/checksum at Director.
• Director and real servers on the same LAN

[Diagram: client (CIP), Internet, Director (VIP), switch, Real Servers 1-3 (RIP1-RIP3). 1. Request (CIP, VIP). 2. Scheduling/rewrite packet to (CIP, RIP1). 3. Process request: reply (RIP1, CIP). 4. Rewrite reply to (VIP, CIP). 5. Receive reply at the client.]
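The address rewriting in the LVS-NAT steps above can be sketched in Python. This is only an illustrative model, not LVS kernel code: the `Packet` class, the connection table, and the function names are hypothetical, and the checksum update that the Director must also perform is left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


VIP, VPORT = "202.103.106.5", 80  # virtual service address from the setup slide

# Hypothetical connection table: (CIP, client port) -> (RIP, real port).
conn_table = {}


def director_inbound(pkt: Packet, rip: str, rport: int) -> Packet:
    """Step 2: remember the chosen real server, rewrite (CIP, VIP) to (CIP, RIP)."""
    conn_table[(pkt.src_ip, pkt.src_port)] = (rip, rport)
    return replace(pkt, dst_ip=rip, dst_port=rport)


def director_outbound(pkt: Packet) -> Packet:
    """Step 4: rewrite the reply's source (RIP) back to VIP before it leaves."""
    if (pkt.dst_ip, pkt.dst_port) not in conn_table:
        raise KeyError("reply does not belong to a known connection")
    return replace(pkt, src_ip=VIP, src_port=VPORT)
```

Because the reply must pass through `director_outbound`, every return packet crosses the Director, which is why the slides call LVS-NAT slow compared with the other forwarding methods.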


7/9/2001 Edward Chow Content Switch 21

LVS-NAT Setup Commands

# make the director forward the masquerading packets
echo 1 > /proc/sys/net/ipv4/ip_forward
ipchains -A forward -j MASQ -s 172.16.0.0/24 -d 0.0.0.0/0
# Add virtual service and link a scheduler to it
ipvsadm -A -t 202.103.106.5:80 -s wlc    # Weighted Least-Connection scheduling
ipvsadm -A -t 202.103.106.5:21 -s wrr    # Weighted Round Robin scheduling
# Add real servers and select forwarding method and weight
ipvsadm -a -t 202.103.106.5:80 -r 172.16.0.2:80 -m
ipvsadm -a -t 202.103.106.5:80 -r 172.16.0.3:8000 -m -w 2
ipvsadm -a -t 202.103.106.5:21 -r 172.16.0.2:21 -m
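The idea behind the `wlc` scheduler named in these commands can be sketched as follows. This is a simplified illustration, not the kernel implementation: it assumes each server is described only by an active-connection count and a weight, and it picks the server with the smallest connections-to-weight ratio. The function name and the input format are hypothetical.

```python
from typing import Dict, Optional, Tuple


def weighted_least_connection(
    servers: Dict[str, Tuple[int, int]]
) -> Optional[str]:
    """Pick the server with the lowest active_connections / weight ratio.

    servers maps a server name to (active_connections, weight).
    Servers with weight 0 are treated as unavailable.
    """
    candidates = {name: cw for name, cw in servers.items() if cw[1] > 0}
    if not candidates:
        return None
    return min(candidates, key=lambda n: candidates[n][0] / candidates[n][1])


# Example: the second server has weight 2 (as with "-w 2"), so with
# 16 connections its ratio is 8, lower than the first server's 10.
choice = weighted_least_connection({
    "172.16.0.2:80": (10, 1),
    "172.16.0.3:8000": (16, 2),
})
```

A server with a larger weight therefore receives proportionally more connections before it stops being the preferred choice.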


7/9/2001 Edward Chow Content Switch 22

LVS-Tunnel Configuration (IP Tunneling)

• Real servers need to handle IP-over-IP packets.
• Real servers can be geographically separated, and return traffic goes through different routes.
• Security implication!

[Diagram: client (CIP), Internet, load balancer Linux box (VIP, RIP0), and Real Servers 1-3 (RIP1-RIP3) reached through IP tunnels. 1. Request (CIP, VIP). 2. Scheduling/put packet in IP tunnel: the original packet (CIP, VIP) is carried inside an outer packet (RIP0, RIP2). 3. Process request. 4. Receive reply: (VIP, CIP) goes from the real server to the client.]
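The IP-over-IP encapsulation shown in the tunnel diagram can be modeled with a short Python sketch. The `IPPacket` class and both function names are hypothetical, and real IP headers carry many more fields; the point is only that the client's packet travels unchanged inside an outer packet addressed to the real server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: object  # bytes, or another IPPacket when tunneled


def director_tunnel(request: IPPacket, director_ip: str, real_ip: str) -> IPPacket:
    """Step 2: wrap the untouched (CIP, VIP) packet in an outer (RIP0, RIPn) packet."""
    return IPPacket(src=director_ip, dst=real_ip, payload=request)


def real_server_reply(outer: IPPacket, body: bytes) -> IPPacket:
    """Steps 3-4: unwrap, then answer the client directly with source VIP."""
    inner = outer.payload
    if not isinstance(inner, IPPacket):
        raise ValueError("expected an IP-over-IP packet")
    return IPPacket(src=inner.dst, dst=inner.src, payload=body)
```

The reply is built from the inner packet's addresses, so it never needs to pass back through the load balancer.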


7/9/2001 Edward Chow Content Switch 23

LVS-Tunnel Setup Commands

#The load balancer (LinuxDirector), kernel 2.2.14
echo 1 > /proc/sys/net/ipv4/ip_forward
ipvsadm -A -t 172.26.20.110:23 -s wlc
ipvsadm -a -t 172.26.20.110:23 -r 172.26.20.112 -i

#The real server 1, kernel 2.2.14
echo 1 > /proc/sys/net/ipv4/ip_forward
# insert it if it is compiled as module
insmod ipip
ifconfig tunl0 172.26.20.110 netmask 255.255.255.255 broadcast 172.26.20.110 up
route add -host 172.26.20.110 dev tunl0
echo 1 > /proc/sys/net/ipv4/conf/all/hidden
echo 1 > /proc/sys/net/ipv4/conf/tunl0/hidden


7/9/2001 Edward Chow Content Switch 24

LVS-DR Configuration (Direct Routing)

• Real servers need to configure a non-arp alias interface with virtual IP address and that interface must share same physical segment with load balancer.

• Only Director’s interface replies to VIP ARP request.

• Director only rewrites the server MAC address; the IP packet is not changed, so it is fast.

[Diagram: client (CIP), Internet, router/switch, Director (VMAC), and Real Servers 1-3 (RMAC1-RMAC3). 1. Request: the frame (GMAC, VMAC) carries IP packet (CIP, VIP). 2. Scheduling/rewrite packet: the Director re-sends it as frame (VMAC, RMAC3) with IP packet (CIP, VIP) unchanged.]

GMAC: Gateway MAC address


7/9/2001 Edward Chow Content Switch 25

LVS-DR Configuration Step 3. Process Request

• Real server returns the reply.

• Reply goes directly through the switch/router, not the Director.

[Diagram: client (CIP), Internet, switch, LinuxDirector (VMAC), and Real Servers 1-3 (RMAC1-RMAC3). 1. Request: frame (GMAC, VMAC), IP packet (CIP, VIP). 2. Scheduling/rewrite packet: frame (VMAC, RMAC3), IP packet (CIP, VIP). 3. Process request: reply frame (RMAC3, GMAC), IP packet (VIP, CIP). 4. Receive reply: (VIP, CIP) reaches the client.]

GMAC: Gateway MAC address


7/9/2001 Edward Chow Content Switch 26

LVS-DR Configuration (Direct Routing)

• Real servers need to configure a non-arp alias interface with virtual IP address and that interface must share same physical segment with load balancer.

• Load balancer only rewrites the server MAC address; the IP packet is not changed, so it is fast.

[Diagram: client (CIP), Internet, switch, LinuxDirector (VMAC), and Real Servers 1-3 (RMAC1-RMAC3). 1. Request: frame (GMAC, VMAC), IP packet (CIP, VIP). 2. Scheduling/rewrite packet: frame (VMAC, RMAC3), IP packet (CIP, VIP). 3. Process request: reply frame (RMAC3, GMAC), IP packet (VIP, CIP). 4. Receive reply: (VIP, CIP) reaches the client.]

GMAC: Gateway MAC address
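The MAC-only rewriting of LVS-DR can be sketched in the same spirit. The `Frame` class and function names below are hypothetical simplifications: the Director changes only the link-layer addresses, and the real server, which holds the VIP on a non-ARP alias interface, answers the client through the gateway.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def director_forward(frame: Frame, director_mac: str, real_mac: str) -> Frame:
    """Step 2: rewrite MAC addresses only; the IP packet (CIP, VIP) is untouched."""
    return replace(frame, src_mac=director_mac, dst_mac=real_mac)


def dr_reply(request: Frame, real_mac: str, gateway_mac: str) -> Frame:
    """Step 3: the real server replies (VIP, CIP) via the gateway, not the Director."""
    return Frame(src_mac=real_mac, dst_mac=gateway_mac,
                 src_ip=request.dst_ip, dst_ip=request.src_ip)
```

No IP address, port, or checksum is modified on the forward path, which is the reason the slides mark this method as fast.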


7/9/2001 Edward Chow Content Switch 27

LVS-DR Setup Commands

#The load balancer (LinuxDirector), kernel 2.2.14 or later
echo 1 > /proc/sys/net/ipv4/ip_forward
ipvsadm -A -t 172.26.20.110:23 -s wlc
ipvsadm -a -t 172.26.20.110:23 -r 172.26.20.112 -g

#The real server 1, 172.26.20.112, kernel 2.2.14 or later
echo 1 > /proc/sys/net/ipv4/ip_forward
ifconfig lo:0 172.26.20.110 netmask 255.255.255.255 broadcast 172.26.20.110 up
route add -host 172.26.20.110 dev lo:0
echo 1 > /proc/sys/net/ipv4/conf/all/hidden
echo 1 > /proc/sys/net/ipv4/conf/lo/hidden


7/9/2001 Edward Chow Content Switch 28

Persistence Handling in LVS

• Sticky connections. Examples:
  – FTP control (port 21), data (port 20). For passive FTP, the server tells the client the port that it listens on, and the client initiates the data connection to that port. For LVS/TUN and LVS/DR, LinuxDirector sees only the client-to-server half of the connection, so it is impossible for LinuxDirector to get the port from the packet that goes to the client directly.
  – SSL session: port 443 for secure Web servers and port 465 for secure mail servers; the key for the connection must be chosen/exchanged.
• Persistent port solution:
  – When a client first accesses the service, LinuxDirector creates a template between the given client and the selected server, then creates an entry for the connection in the hash table.
  – The template expires after a configurable time, but not until all its connections have expired.
  – Connections for any port from the client are sent to the same server until the template expires.
  – The timeout of persistent templates can be configured by users; the default is 300 seconds.
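The persistent-template idea described above can be sketched as a small Python class. This is a simplified illustration rather than the LVS data structure: the class and method names are hypothetical, the template is refreshed on every new connection, and the rule that a template outlives its open connections is not modeled.

```python
import time
from typing import Callable, Dict, Tuple


class PersistenceTable:
    """Maps a client IP to its selected server until the template times out."""

    def __init__(self, timeout: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout          # default of 300 seconds, as on the slide
        self.clock = clock
        self.templates: Dict[str, Tuple[str, float]] = {}

    def lookup_or_assign(self, client_ip: str,
                         choose_server: Callable[[], str]) -> str:
        now = self.clock()
        entry = self.templates.get(client_ip)
        if entry is not None and entry[1] > now:
            server = entry[0]           # template still valid: stick to it
        else:
            server = choose_server()    # first access or expired: schedule anew
        self.templates[client_ip] = (server, now + self.timeout)
        return server
```

Keying the template on the client address alone, not on the port, is what lets an FTP data connection or a second SSL connection land on the same real server as the first one.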


7/9/2001 Edward Chow Content Switch 29

HA-LVS Configuration (High Availability)

[Diagram: client (CIP), Internet, LinuxDirector and Backup Director linked by HeartBeat, each with MON, in front of Real Servers 1-3.]

1. When the Backup Director detects LinuxDirector failure through the heartbeat protocol, it "graciously negotiates" the take-over of the VIP. This provides fault tolerance.

2. Monitor the server processes that run on the real servers. Route requests to server processes that are alive. Initiate restart/repair.
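The take-over logic in point 1 can be illustrated with a minimal Python sketch. It does not reflect the actual Heartbeat software or its protocol: the class name, the `dead_after` threshold, and the explicit `now` timestamps are assumptions made so the example stays self-contained.

```python
class BackupDirector:
    """Claims the VIP when heartbeats from the primary stop arriving."""

    def __init__(self, dead_after: float = 3.0) -> None:
        self.dead_after = dead_after   # hypothetical silence threshold in seconds
        self.last_beat = 0.0
        self.owns_vip = False

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the primary LinuxDirector."""
        self.last_beat = now

    def tick(self, now: float) -> bool:
        """Periodic check; take over the VIP once the primary has been silent too long."""
        if not self.owns_vip and now - self.last_beat > self.dead_after:
            self.owns_vip = True
        return self.owns_vip
```

In a real deployment the take-over step would also bring up the VIP on the backup's interface and announce it to the network; that part is outside this sketch.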


7/9/2001 Edward Chow Content Switch 30

Performance of LVS-based Systems

"We ran a very simple LVS-DR arrangement with one PII-400 (2.2.14 kernel) directing about 20,000 HTTP requests/second to a bank of about 20 Web servers answering with tiny identical dummy responses for a few minutes. Worked just fine." Jerry Glomph Black, Director, Internet & Technical Operations, RealNetworks

"I had basically (1024) four class-Cs of virtual servers which were load balanced through a LinuxDirector (two, actually -- I used redundant directors) onto four real servers which each had the four different class-Cs aliased on them." "Ted Pavlic" <[email protected]>


7/9/2001 Edward Chow Content Switch 31

LVS Usage Survey 2/15/2001 Lorn Key

Clusters                | 20                    | 1           | 2              | 2              | 2
Directors per Cluster   | 2                     | 2           | 2              | 2              | 2
Total Real Servers      | 170                   | 12          | 4              | 15             | 6
Routing Methods         | DR/NAT                | DR          | NAT            | DR             | NAT
Schedule Methods        | RR/WLC                | WRR         | LC             | WLC            | WLC
Types of Real Servers   | RH6.2                 | Linux       | Win, Linux     | Linux, Solaris | RH
Service Offered         | WWW                   | WWW/other   | WWW, DB        | WWW, SMTP      | WWW
File System Replication | rsync                 | rsync       | Coda, NFS      | Custom         | rsync, custom
Monitoring Software     | Heartbeat, ldirectord | Nanny/Pulse | Heartbeat, Mon | Nanny, Pulse   | Heartbeat


C. Edward Chow

Department of Computer Science

University of Colorado at Colorado Springs

Sponsored by Computer Comm. Lab/ITRI


7/9/2001 Edward Chow Content Switch 33

Content Switch Topics

• What is a Content Switch?
• What Services It Can Provide
• Content Switch Example
• Related Technologies
• Content Switch Architecture and Basic Operations
• TCP Delay Binding and Related Improvement
• Content Switch Rule and Conflict Detection
• Conclusion


7/9/2001 Edward Chow Content Switch 34

Content Switch (CS)

• Route packets based on high layer (Layer 5/7) headers and content.

• Examples:
  – Direct Web traffic based on pattern of
    • URLs, cookies (URL switching)
    • XML tag value (Web switching)
  – Can route incoming email based on email address; connect POP/IMAP based on login

• Web switches and Intel XML Director/accelerator are special cases of content switch.


7/9/2001 Edward Chow Content Switch 35

What Services It Can Provide

• Enabling premium services for e-commerce, ISP, and Web hosting providers

• Load Balancing and High Available Server Clusters: Web, E-commerce, Email, Computing, File, SAN

• Policy-based networking, differential/QoS services.
• Firewall, strengthening DoS protection, cache/firewall load-balancing
• 'Flash-crowd' management
• Email spam protection, virus detection/removal
• Applet authentication/filtering


7/9/2001 Edward Chow Content Switch 36

F5 VRM Solution

[Diagram labels: user at london.domain.com, local DNS, 3-DNS, GLOBAL-SITE, webmaster, Internet, router, BIG-IP, server array; Site I newyork.domain.com, Site II losangeles.domain.com, Site III tokyo.domain.com.]


7/9/2001 Edward Chow Content Switch 37

Intel Netstructure XML Director 7280

• Example of Rule:
    Server1: create */order.asp & //Amount[Value >= 10000]


7/9/2001 Edward Chow Content Switch 38

Phobos In-Switch

• Only load balancing switch in a PCI card form factor

• Plugs directly into any server PCI slot

• Supports up to 8,192 servers, ensuring availability and maximum performance

• Six different algorithms are available for optimum performance: Round Robin, Weighted Percentage, Least Connections, Fastest Response Time, Adaptive and Fixed.

• Provides failover to other servers for high-availability of the web site

• U.S. Retail $1995.00


7/9/2001 Edward Chow Content Switch 39

E-Commerce Example: 1. Client

Client submits via HTTP POST (or SOAP) the following purchase in XML:

<purchase>
  <customerName>CCL</customerName>
  <customerID>111222333</customerID>
  <item>
    <productID>309121544</productID>
    <productName>IBM Thinkpad T21</productName>
    <unitPrice>5000</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>50000</subTotal>
  </item>
  <item>
    <productID>309121538</productID>
    <productName>Intel wireless LAN PC Card</productName>
    <unitPrice>200</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>2000</subTotal>
  </item>
  <totalAmount>52000</totalAmount>
</purchase>


7/9/2001 Edward Chow Content Switch 40

E-Commerce Example: 2. Content Switch

• Content switch receives the packet.
• Recognize it is an HTTP POST request from the HTTP request line
    POST /purchase.cgi HTTP/1.1
• Recognize it is an XML document from the meta header
    content-type: TEXT/XML
• Parse the XML content
• Extract values of tag sequences:
    purchase/totalAmount = 52000
    purchase/customerName = CCL
• Rule 1 is matched and the packet is routed to one of highSpeedServers.
    Rule 1: if (xml.purchase/totalAmount > 5000) routeTo(highSpeedServers);
    Rule 2: if (xml.purchase/customerName == CCL) routeTo(specialCustomerServers);
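The extraction and rule-matching steps above can be sketched in Python with the standard `xml.etree.ElementTree` parser. This is a minimal illustration, not the content switch's rule engine: the `route` function, the first-match-wins ordering, and the `defaultServers` fallback group are assumptions.

```python
import xml.etree.ElementTree as ET


def route(xml_body: str) -> str:
    """Return the server group for a purchase document; first matching rule wins."""
    root = ET.fromstring(xml_body)
    if root.tag != "purchase":
        return "defaultServers"          # hypothetical fallback group
    total = root.findtext("totalAmount")
    name = root.findtext("customerName")
    # Rule 1: if (xml.purchase/totalAmount > 5000) routeTo(highSpeedServers)
    if total is not None and float(total) > 5000:
        return "highSpeedServers"
    # Rule 2: if (xml.purchase/customerName == CCL) routeTo(specialCustomerServers)
    if name == "CCL":
        return "specialCustomerServers"
    return "defaultServers"


order = ("<purchase><customerName>CCL</customerName>"
         "<totalAmount>52000</totalAmount></purchase>")
group = route(order)  # both rules apply here; Rule 1 is checked first
```

The example document satisfies both rules, which is the kind of overlap that rule conflict detection has to deal with; here the ordering simply decides.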


7/9/2001 Edward Chow Content Switch 41

No Free Lunch: Penalty of Having Content Switch

Increased packet processing time.

• For XML Director/Accelerator, it needs to parse the XML document and match tag sequences. 1-3? order of processing time

                         | Layer 4 Switching  | Layer 7 Switching
packet header extraction | fixed short fields | varying length long fields
switch rule matching     | hash table look up | pattern matching

Size of XML Document (Bytes) | XML Content Extract Time (ms)
600                          | 14
7000                         | 21
67104                        | 53


7/9/2001 Edward Chow Content Switch 42

Related Technologies

• Application level solution: Proxy server; Apache/Tomcat/Servlet; Microsoft NLB
• Kernel level layer 4 load balancing solution: http://www.linuxvirtualserver.org/
  – Joseph Mack's presentation
  – LVS-NAT (Network Address Translation) web page
  – LVS-IP Tunnel web page
  – LVS-DR (Direct Routing) web page
• Hardware solution: Cisco 11000, F5 (Big IP), Alteon Web Systems, Foundry Networks (ServerIron). Excellent information in: Foundry ServerIron Installation and Configuration Guide, May 2000.
• Routing table lookup: Longest prefix (Gupta/McKeown)


7/9/2001 Edward Chow Content Switch 43

Basic Operations of Content Switching

[Diagram: incoming packets pass through header/content extraction, then packet classification (CS rule matching algorithm, using the CS rules maintained with the CS rule editor), then packet routing (load balancing), which uses network path info and server load status and forwards packets to servers.]

CS: Content Switching


7/9/2001 Edward Chow Content Switch 44

Content Switch Architecture

Apostolopoulos, Infocom 2000


7/9/2001 Edward Chow Content Switch 45

Content Switch Architecture

[Diagram: client, content switch with hash table, Real Server1.]

Case A: Controller finds there is an entry in its hash table; route request to the "sticky connection" outgoing port.


7/9/2001 Edward Chow Content Switch 46

Content Switch Architecture

[Diagram: client, content switch with hash table, Real Server1.]

Case B, Step 1. Controller finds there is no entry in the hash table; route request to the content switch processor.


7/9/2001 Edward Chow Content Switch 47

Content Switch Architecture

[Diagram: client, content switch with hash table, CS rules, and pkt modification info, Real Server1.]

Case B, Step 1. Controller finds there is no entry in the hash table; route request to the content switch processor.

Step 2. CS processor:
a. Extract content/match CS rules
b. Route request
c. Set up sequence # modification on server side port


7/9/2001 Edward Chow Content Switch 48

Content Switch Architecture

[Diagram: client, content switch with hash table, CS rules, and pkt modification info, Real Server1.]

Case B, Step 1. Controller finds there is no entry in the hash table; route request to the content switch processor.

Step 2. CS processor:
a. Extract content/match CS rules
b. Route request
c. Set up sequence # modification on server side port

Step 3. At the server side port, return pkts are modified (sequence #/IP addr/checksum) and routed back to the client.
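The two cases can be summarized in a short Python sketch. It is only a model of the control flow, not the switch's firmware: the `sticky` dictionary stands in for the controller's hash table, and `match_rules` is a hypothetical callback standing in for the content switch processor's extraction and rule matching.

```python
from typing import Callable, Dict, Tuple

# Stand-in for the controller's hash table: connection key -> chosen server.
sticky: Dict[Tuple[str, int], str] = {}


def handle(conn_key: Tuple[str, int], payload: str,
           match_rules: Callable[[str], str]) -> Tuple[str, str]:
    """Return (server, path taken) for a packet on the given connection."""
    if conn_key in sticky:
        # Case A: entry found, forward on the existing "sticky connection".
        return sticky[conn_key], "fast path"
    # Case B: no entry, hand the request to the content switch processor,
    # then record the decision so later packets take the fast path.
    server = match_rules(payload)
    sticky[conn_key] = server
    return server, "content switch processor"
```

Only the first request of a connection pays for content extraction and rule matching; subsequent packets are resolved by a single table lookup.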


7/9/2001 Edward Chow Content Switch 49

Efficient Software Architecture

• Tasks: Millions of packets with thousands of rules to match and load balancing algorithms to run.
• How to assign tasks to the (network) processors and threads?
  – Packet Extraction (understand header formats, XML parsing)
  – Content Switching Rule Matching
  – Packet Routing (load balancing, bandwidth control)
• How much packet processing should controllers do?
• What can a controller do?
• A typical parallel processing problem?


7/9/2001 Edward Chow Content Switch 50

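TCP Delay Binding (Splicing)

[Sequence diagram, steps 1 to 11, among client, content switch, and server. Messages in order:]

Client to content switch: SYN(CSEQ)
Content switch to client: SYN(DSEQ) ACK(CSEQ+1)
Client to content switch: ACK(DSEQ+1)
Client to content switch: DATA(CSEQ+1) ACK(DSEQ+1)
Content switch to server: SYN(CSEQ)
Server to content switch: SYN(SSEQ) ACK(CSEQ+1)
Content switch to server: ACK(SSEQ+1)
Content switch to server: DATA(CSEQ+1) ACK(SSEQ+1)
Server to content switch: DATA(SSEQ+1) ACK(CSEQ+lenR+1)
Content switch to client: DATA(DSEQ+1) ACK(CSEQ+lenR+1)
Client to content switch: ACK(DSEQ+lenD+1)
Content switch to server: ACK(SSEQ+lenD+1)
Step 11: DATA(?) 2nd request ACK(?)

lenR: size of HTTP request. lenD: size of returned document.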
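The sequence number conversion implied by the diagram can be written down as a short Python sketch. The function names are hypothetical; the arithmetic follows from the labels above, where the client sees the switch's initial sequence number DSEQ and the server uses its own SSEQ, so every relayed packet is shifted by the difference between the two, modulo 2^32.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def server_to_client_seq(seq: int, sseq: int, dseq: int) -> int:
    """Translate a sequence number sent by the server into the client's view."""
    return (seq - sseq + dseq) % MOD


def client_to_server_ack(ack: int, sseq: int, dseq: int) -> int:
    """Translate an ACK number sent by the client into the server's view."""
    return (ack - dseq + sseq) % MOD


# Worked example with illustrative values SSEQ=5000, DSEQ=1000, lenD=300:
# the server's DATA(SSEQ+1) must reach the client as DATA(DSEQ+1).
data_seq = server_to_client_seq(5001, sseq=5000, dseq=1000)
# The client's ACK(DSEQ+lenD+1) must reach the server as ACK(SSEQ+lenD+1).
final_ack = client_to_server_ack(1301, sseq=5000, dseq=1000)
```

Since the TCP checksum covers the sequence and acknowledgment fields, each translated packet also needs its checksum recomputed, which this sketch does not show.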


7/9/2001 Edward Chow Content Switch 51

Improve Content Switching

• Set up CS-to-real-server connections ahead of time (persistent HTTP connections). NetScale: reduce TCP 3-way handshake time.
• Pre-allocate server scheme (guess the real server based on the TCP SYN)
• Sequence # modification on every return pkt; need to recompute checksum also.
• Filter scheme (offload sequence # modification/rule matching to real servers).
• Buffering/pipeline (aggregate) requests


7/9/2001 Edward Chow Content Switch 52

Pre-Allocate Server Scheme

[Sequence diagram, steps 1 to 6, among client, content switch, and pre-allocated server. The content switch relays each message with the same sequence numbers:]

step 1: SYN(CSEQ), client to server
step 2: SYN(SSEQ) ACK(CSEQ+1), server to client
step 3: ACK(SSEQ+1), client to server
step 4: DATA(CSEQ+1) ACK(SSEQ+1), client to server
step 5: DATA(SSEQ+1) ACK(CSEQ+lenR+1), server to client
step 6: ACK(SSEQ+lenD+1), client to server

• Guess routing decision based on IP/Port#/History
• Advantages:
  – Faster than TCP delay binding.
  – Possible direct route between client and server
  – Reduce session processing overhead: no need to convert server sequence #


7/9/2001 Edward Chow Content Switch 53

Degenerated to TCP Delayed Binding If Guess is Wrong

[Sequence diagram, steps 1 to 12, among client, content switch, pre-allocated server, and right server. Messages as labeled:]

Steps 1 to 4, client and pre-allocated server (relayed by the content switch): SYN(CSEQ); SYN(SSEQ)/ACK(CSEQ+1); ACK(SSEQ+1); DATA(CSEQ+1)/ACK(SSEQ+1)
Step 5, pre-allocated server: DATA(SSEQ+1), FIN(CSEQ+lenR+1); server sent HTTP 404
Steps 6 to 9, content switch and right server: SYN(CSEQ); SYN(RSEQ)/ACK(CSEQ+1); ACK(RSEQ+1); DATA(CSEQ+1)/ACK(RSEQ+1)
Steps 10 to 12: DATA(RSEQ+1)/ACK(CSEQ+lenR+1) from the right server is delivered to the client as DATA(SSEQ+1)/ACK(CSEQ+lenR+1); ACK(SSEQ+lenD+1) from the client is delivered to the right server as ACK(RSEQ+lenD+1)

Sequence # conversion is now needed for the right server.


Filter Process Scheme

[Sequence diagram, steps 1-10: client, content switch, server; the filter process runs on the server. Messages shown:
  Client and content switch: SYN(CSEQ); SYN(DSEQ)/ACK(CSEQ+1); ACK(DSEQ+1); DATA(CSEQ+1)/ACK(DSEQ+1).
  Step 5b: Migrate(Data, CSEQ, DSEQ).
  Server side: SYN(CSEQ); SYN(SSEQ)/ACK(CSEQ+1); ACK(SSEQ+1); DATA(CSEQ+1)/ACK(SSEQ+1).
  Response: DATA(SSEQ+1)/ACK(CSEQ+lenR+1) appears on the client side as DATA(DSEQ+1)/ACK(CSEQ+lenR+1); the client's ACK(DSEQ+lenD+1) appears on the server side as ACK(SSEQ+lenD+1).]


Pre-allocate performance plot

[Plot of response time vs. document size. X axis: bytes (0 to 40000). Y axis: microseconds (0 to 500000). Four curves, Series 1 to Series 4.]

Figure 3. Performance of Pre-allocate Server Scheme

Series 1 - Basic scheme with no rule matching module inserted, i.e., using default IPVS.

Series 2 - Basic scheme with the rule matching module inserted.

Series 3 - Pre-allocate scheme with all hits, i.e., where all pre-allocate guesses were correct.

Series 4 - Pre-allocate scheme with all misses, i.e., where all pre-allocate guesses were wrong.


Handling multiple requests in a Keep-Alive connection

• Determine when a new request arrives
  – Verify that the previous request has been completely received
  – Request data size is > 0
• The key assumption is that the client sends only one outstanding request at a time, i.e., requests are not pipelined
• Reuse connections
  – Once a connection is established, store its control information in a hash table keyed by real server address.
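The connection-reuse idea can be sketched as a small fixed-size hash table. This is only an illustration under stated assumptions: the key is the real server's IPv4 address and port, the stored control information is reduced to the server's initial sequence number, and the names `conn_info`, `conn_store`, and `conn_lookup` are hypothetical rather than taken from the LCS code. Entries are never deleted, which keeps linear probing simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONN_SLOTS 64  /* hypothetical table size; must be a power of two */

struct conn_info {
    uint32_t server_ip;   /* key: real server IPv4 address */
    uint16_t server_port; /* key: real server port */
    uint32_t sseq;        /* control info: server initial sequence number */
    int      in_use;      /* 0 = empty slot */
};

static struct conn_info conn_table[CONN_SLOTS];

static size_t slot_of(uint32_t ip, uint16_t port)
{
    assert((CONN_SLOTS & (CONN_SLOTS - 1)) == 0);
    uint32_t h = (ip * 2654435761u) ^ port; /* multiplicative hash */
    return h & (CONN_SLOTS - 1);
}

/* Insert or overwrite the entry for (ip, port). Returns 0 on success,
 * -1 if the table is full. */
int conn_store(uint32_t ip, uint16_t port, uint32_t sseq)
{
    size_t s = slot_of(ip, port);
    for (size_t n = 0; n < CONN_SLOTS; n++, s = (s + 1) & (CONN_SLOTS - 1)) {
        if (!conn_table[s].in_use ||
            (conn_table[s].server_ip == ip && conn_table[s].server_port == port)) {
            conn_table[s] = (struct conn_info){ ip, port, sseq, 1 };
            return 0;
        }
    }
    return -1;
}

/* Return the stored entry for (ip, port), or NULL if none exists. */
const struct conn_info *conn_lookup(uint32_t ip, uint16_t port)
{
    size_t s = slot_of(ip, port);
    for (size_t n = 0; n < CONN_SLOTS; n++, s = (s + 1) & (CONN_SLOTS - 1)) {
        if (!conn_table[s].in_use)
            return NULL; /* probing reached an empty slot: key is absent */
        if (conn_table[s].server_ip == ip && conn_table[s].server_port == port)
            return &conn_table[s];
    }
    return NULL;
}
```

When the next request on a keep-alive connection is routed to a server that already has an entry, the switch can reuse the stored state instead of performing a new handshake.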


Quiz

• Web server keeps the TCP connection alive, expecting the browser to return for images and in-line media files.

• How many keep-alive connections are set up by IE5 and Netscape 4.7 for a web page with many .jpg/.gif images?

• Can these image requests be pipelined from client browser to web server?


Multiple HTTP Requests from One TCP Connection

• A keep-alive TCP connection may include multiple HTTP “GET” requests.
• The content switch examines each “GET” request and makes a new routing decision.
• The content switch establishes another connection with a different server based on the routing decision.
• The HTTP responses from different servers need to be interleaved and seen by the user as if they came from the same server.
• Solutions: in-order delivery (buffer requirement); out-of-order delivery (seq# tracking)?
• Problems: should we throw away earlier HTML requests if later requests are received?

[Diagram, NAT approach: a client requests Index.htm, cs.jpg, rocky.mid, and uccs.gif through the content switch, which fronts server1, server2, ..., server9.]


Multiple HTTP Requests from One TCP Connection

• Can servers return documents directly to client in keep-alive session case?

• Can equivalent VS-Tunnel or VS-DR be implemented using Content Switch?

[Diagram: a client, the content switch, and server1, server2, ..., server9, with the files cs.gif, rocky.mid, and uccs.jpg.]


Content Switch Rule Survey

Survey shows that existing switches support
• rules in basic (condition action) or (action condition) form
• some define the condition as a class, then specify the action in a separate statement or command
• simple single conditional term
• command line interface (to facilitate incremental update?)
• actions can include reject, forward, put in queue (for bandwidth control, scheduling)


Content Switch Rule Design

• Rule syntax is generic to support all intended features.
• Use simple C if-statement syntax. rule: if (condition) { action }
  – Easy to read
  – Allows optimization using the C compiler
• Condition consists of multiple terms of
  – variable relational_operator value, e.g.
      xml.purchase/totalAmount > 50000
      smtp.to == “[email protected]”
      cookie.name == “servlet1”
      bitmatch(64, 8, 0xff) == 64   # the above means TTL == 64; idea from the netfilter universal filter
  – suffix(variable, string), e.g. suffix(url, “gif”)
  – regex(variable, pattern), e.g. regex(url, “/purchase”)
• Action consists of reject, forward(server | queue), loadBalance(serverGroup, loadBalancingAlgorithm)
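Two of the condition helpers named above can be sketched in plain C. The signatures are assumptions made for illustration: the rule syntax writes `bitmatch(64, 8, 0xff)` with three arguments, while this sketch passes the packet buffer explicitly and reads the arguments as (bit offset, bit length, mask), restricted to byte-aligned fields of at most 32 bits. It is not the LCS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if string s ends with suf, else 0. */
int suffix(const char *s, const char *suf)
{
    assert(s != NULL && suf != NULL);
    size_t ls = strlen(s), lf = strlen(suf);
    return lf <= ls && strcmp(s + ls - lf, suf) == 0;
}

/* Reads nbits starting at bit offset bit_off of the packet (big-endian),
 * then applies mask. Assumption: offsets and lengths are multiples of 8 and
 * nbits <= 32. Returns 0 when the request is unaligned or out of range. */
uint32_t bitmatch(const uint8_t *pkt, size_t pkt_len,
                  size_t bit_off, size_t nbits, uint32_t mask)
{
    assert(pkt != NULL);
    size_t start = bit_off / 8, n = nbits / 8;
    uint32_t v = 0;

    if (bit_off % 8 != 0 || nbits % 8 != 0 || n == 0 || n > 4 ||
        start + n > pkt_len)
        return 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | pkt[start + i];
    return v & mask;
}
```

With an IPv4 header as the packet, bit offset 64 is byte 8, the TTL field, which is why `bitmatch(64, 8, 0xff) == 64` in the rule syntax tests for TTL 64.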


Efficient CS Rule Matching

• Brute force, strict priority: rules are executed sequentially.
• Efficient rule matching method:
  – Organize rules so that rules can be skipped based on existing content types.
  – Utilize compiler optimization techniques.
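One way to read "rules can be skipped based on existing content types" is to tag each rule with the content types its condition reads and to skip it when the request does not carry them. The sketch below assumes that interpretation; the `request` structure, the `CT_*` flags, the sample conditions, and `match_rules` are all hypothetical names for illustration, and strict priority order is preserved among the rules that are not skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum content_type { CT_URL = 1, CT_COOKIE = 2, CT_XML = 4 };

struct request {
    const char *url; /* NULL if absent */
    const char *xml; /* NULL if the request has no XML body */
};

struct rule {
    unsigned needs;                      /* content types the condition reads */
    int (*cond)(const struct request *); /* nonzero when the rule matches */
    int action;                          /* hypothetical action code */
};

static int xml_has_purchase(const struct request *r)
{
    return r->xml != NULL && strstr(r->xml, "<purchase>") != NULL;
}

static int url_has_lcs1(const struct request *r)
{
    return r->url != NULL && strstr(r->url, "lcs1") != NULL;
}

static int url_has_lcs2(const struct request *r)
{
    return r->url != NULL && strstr(r->url, "lcs2") != NULL;
}

/* Sample rule set in priority order. */
const struct rule demo_rules[] = {
    { CT_XML, xml_has_purchase, 1 },
    { CT_URL, url_has_lcs1,     2 },
    { CT_URL, url_has_lcs2,     3 },
};

/* Returns the action of the first matching rule, or -1 if none matches.
 * Rules whose required content types are not all present are skipped
 * without running their condition; *evaluated counts conditions run. */
int match_rules(const struct rule *rules, size_t n, unsigned present,
                const struct request *req, size_t *evaluated)
{
    assert(rules != NULL && req != NULL && evaluated != NULL);
    *evaluated = 0;
    for (size_t i = 0; i < n; i++) {
        if ((rules[i].needs & present) != rules[i].needs)
            continue;
        (*evaluated)++;
        if (rules[i].cond(req))
            return rules[i].action;
    }
    return -1;
}
```

For a plain GET request without an XML body, the XML rule is skipped with a single bitmask test, so only the two URL conditions are evaluated.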


Simple CS Rule Editor GUI


Conflict Detection on Content Switching Rules

• Detect conflicts among rules or rule sets.
• Absolute conflict type:
    r1: if (xml.purchase/customerName == “CCL”) {routeTo(r1)}
    r2: if (xml.purchase/customerName == “CCL”) {routeTo(r2)}
• Potential conflict type:
    r1: if (xml.purchase/totalAmount > 5000) {routeTo(quickServers)}
    r2: if (xml.purchase/totalAmount > 20000) {routeTo(superServers)}
• Algorithm: build a tree for terms with the same variable, check the operator and value to see whether they are the same or lead to a potential conflict, then compare actions to decide the conflict type or duplication.
• Developed a conflict detection algorithm for rules with multiple-term conditions. It can be applied to policy-based rule conflict detection.
• The editor can build these trees while a user enters rules and warn about conflicts right away.
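The classification step can be illustrated with a simplified pairwise check. This is a sketch, not the tree-based algorithm described above: it assumes single-term rules over integer values with the operators ==, > and <, turns each condition into an interval, and compares two rules directly. All type and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

enum op { OP_EQ, OP_GT, OP_LT };
enum conflict { NO_CONFLICT, DUPLICATE, ABSOLUTE_CONFLICT, POTENTIAL_CONFLICT };

struct term_rule {
    const char *variable; /* e.g. "xml.purchase/totalAmount" */
    enum op op;
    long value;
    const char *action;   /* e.g. "quickServers" */
};

/* Converts a condition into the closed interval [lo, hi] of values that
 * satisfy it. An unsatisfiable condition yields an empty interval (lo > hi). */
static void to_interval(const struct term_rule *r, long *lo, long *hi)
{
    *lo = LONG_MIN;
    *hi = LONG_MAX;
    switch (r->op) {
    case OP_EQ:
        *lo = *hi = r->value;
        break;
    case OP_GT:
        if (r->value == LONG_MAX) { *lo = 1; *hi = 0; }
        else *lo = r->value + 1;
        break;
    case OP_LT:
        if (r->value == LONG_MIN) { *lo = 1; *hi = 0; }
        else *hi = r->value - 1;
        break;
    }
}

/* Same condition, same action: duplicate. Same condition, different
 * action: absolute conflict. Overlapping conditions, different actions:
 * potential conflict. Anything else: no conflict. */
enum conflict classify(const struct term_rule *a, const struct term_rule *b)
{
    long alo, ahi, blo, bhi, lo, hi;
    int same_action;

    assert(a != NULL && b != NULL);
    same_action = strcmp(a->action, b->action) == 0;
    if (strcmp(a->variable, b->variable) != 0)
        return NO_CONFLICT;
    if (a->op == b->op && a->value == b->value)
        return same_action ? DUPLICATE : ABSOLUTE_CONFLICT;

    to_interval(a, &alo, &ahi);
    to_interval(b, &blo, &bhi);
    lo = alo > blo ? alo : blo;
    hi = ahi < bhi ? ahi : bhi;
    if (lo <= hi && !same_action)
        return POTENTIAL_CONFLICT;
    return NO_CONFLICT;
}
```

Applied to the two totalAmount rules above, the intervals (5000, max] and (20000, max] overlap and the actions differ, so the pair is reported as a potential conflict.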


XML Tag Value Extraction

• An xmlContentExtract() function was built to extract the tag values of a list of unique tag sequences.
• It is based on Clark Cooper’s expat 1.0 XML parser.
• Its arguments include the pointer to an XML document, the pointer to the array of strings (unique XML tag sequences; we follow the XSL selector syntax), and the number of sequences.
• It returns a list of structure nodes, each with the tag sequence, its attribute, and its value.
• Currently, it supports one attribute, and each tag sequence needs to be unique.


Status of UCCS ACSD Project

• A Linux-based LVS content switch called LCS was developed.
• Sponsored by CCL/ITRI.
• Based on Linux-2.2.16-3; current release LCS02.
• ip_forward.c, ip_masq.c, and ip_vs.c are modified to implement basic TCP delay binding.
• ip_cs.c was added for most of the content switching functions, with HTTP header extraction and XML content extraction.
• A simple Java-based ruleEdit program was created for rule editing and conflict detection.
• A rule translation program converts the rule set into a Linux kernel module and allows dynamic replacement of rules without restarting the system.
• LCS is being ported to the Intel IXP 1200 network processor.


LCS Demo

• We set up viva.uccs.edu as a content switch and wait and ace as two real servers.

• URL Switching demo:
    http://viva.uccs.edu/~lcs1/ routes to ace.uccs.edu
    http://viva.uccs.edu/~lcs2/ routes to wait.uccs.edu
• XML Web Switching (E-commerce applications): http://archie.uccs.edu/~acsd/lcs/xmldemo.html
    When the 2nd subtotal tag >= 50000, route to ace.
    When the 2nd subtotal tag < 50000, route to wait.
• Let us know if you have problems accessing them. My students may be working on LCS extensions.


LCS Rule Example

/* R4, R5: XML switching on the extracted field value */
R4:  if (atoi(rule_fields[1].value) >= 50000) {
         return route_to("ace", NON_STICKY, saddr);
     }
R5:  if ((atoi(rule_fields[1].value) > 0) && (atoi(rule_fields[1].value) < 50000)) {
         IP_RULE_MSG("server=wait\n");
         return route_to("wait", NON_STICKY, saddr);
     }
/* R10, R11: URL switching on a substring of the request URL */
R10: if (strstr(url, "lcs1") != NULL) {
         IP_RULE_MSG("server=ace\n");
         return route_to("ace", NON_STICKY, saddr);
     }
R11: if (strstr(url, "lcs2") != NULL) {
         IP_RULE_MSG("server=wait\n");
         return route_to("wait", NON_STICKY, saddr);
     }


Related Load Balancing Research Results

• Modified the Apache status module to report
  – Total bytes to be transferred by child processes
  – Average document transfer speed

• Modified LB-DNS to receive server status and bandwidth probing results.

• LB-DNS returns the IP address of the best server based on a weight contributed by both server load and bandwidth.

• Modified WebStone benchmark to test the performance of load balancing web server clusters.
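The slides do not give the weight formula used by LB-DNS, so the following is purely an illustrative assumption: each server gets a score that mixes spare capacity (1 - load) with probed bandwidth normalized to the best server, and a tunable factor `alpha` sets the balance between the two. The structure `server_status` and the function `best_server` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct server_status {
    double load;      /* assumed scale: 0.0 = idle, 1.0 = saturated */
    double bandwidth; /* probed bandwidth, any consistent unit */
};

/* Returns the index of the server with the highest combined score.
 * alpha in [0, 1]: 1.0 ranks by load only, 0.0 by bandwidth only.
 * Ties go to the lowest index. */
size_t best_server(const struct server_status *s, size_t n, double alpha)
{
    double max_bw = 0.0, best_score = -1.0;
    size_t best = 0;

    assert(s != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
        if (s[i].bandwidth > max_bw)
            max_bw = s[i].bandwidth;

    for (size_t i = 0; i < n; i++) {
        double bw_term = max_bw > 0.0 ? s[i].bandwidth / max_bw : 0.0;
        double score = alpha * (1.0 - s[i].load) + (1.0 - alpha) * bw_term;
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

A DNS-based balancer built this way would recompute the ranking whenever a new status report or bandwidth probe arrives and answer queries with the address of the current best server.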


Load balancing Systems

[Diagram components: Modified Web Server 1 ... Modified Web Server n; Statistics Gathering Daemon; LBA: Modified DNS. Labels on the diagram: Request for Web pages; Server Delay; Server Ranking; /tmp/StatFile; Bandwidth Probe Results.]


Connection Rate: LBA vs. Round-Robin

Server connection rate for 4 servers (connections/sec; chart axis 0 to 1000)

Update for LBA, per sec   Load balancing system   Round-robin
1                         418.2                   327.6
2                         656.6                   327.6
3                         907.9                   327.6
4                         420                     327.6
5                         636.7                   327.6
6                         322.6                   327.6
7                         711.6                   327.6
8                         420.5                   327.6
9                         638.3                   327.6
10                        670.6                   327.6
11                        683.4                   327.6
12                        899                     327.6

Round-robin was only run once.


Conclusion

• Content Delivery Networks improve Internet content retrieval.
• LVS provides a low-cost layer 4 switching service for clusters.
• The Linux Content Switch with generic rules can be easily configured for a wide variety of value-added services:
  – Premium services
  – Load balancing/highly available server farm
  – Firewall
  – Bandwidth control/traffic shaping
• It requires an efficient SW/HW architecture and efficient rule matching algorithms to reduce processing overhead.
• Content rule design and conflict detection are important and challenging.

• TCP delay binding can be improved.


References

• http://www.linuxvirtualserver.org/
• http://www.akamai.com/
• http://cs.uccs.edu/~chow/pub/contentsw/talk/contentswitching.ppt
• [Aron2000] Mohit Aron, “Differential and predictable QoS in web server systems,” Ph.D. dissertation, Rice University, Oct. 2000.
• [Zhang97] Lixia Zhang, Sally Floyd, and Van Jacobson, “Adaptive Web Caching,” April 25, 1997. http://www-nrg.ee.lbl.gov/floyd/web.html
• [Esi2001] Edge Side Includes, http://www.esi.org/.
• [Chow2001a] C. Edward Chow and Indira Semwal, “Web Load Balancing Through More Accurate Server Report,” Proceedings of PDCAT 2001, Taipei, Taiwan.
• [Chow2001b] C. Edward Chow, Ganesh Godavari, and Jianhua Xie, “Content Switch Rules and their Conflict Detection,” Proceedings of PDCAT 2001, Taipei, Taiwan.
• [Chow2001c] C. Edward Chow and Weihong Wang, “The Design and Implementation of Linux LVS-based Content Switch,” Proceedings of PDCAT 2001, Taipei, Taiwan.
• [Aversa2000] Luis Aversa and Azer Bestavros, “Load Balancing a Cluster of Web Servers: Using Distributed Packet Rewriting,” Proceedings of IPCCC 2000.
• [Cao98] Pei Cao, Jin Zhang, and Kevin Beach, “Active Cache: Caching Dynamic Contents on the Web,” http://www.cs.wisc.edu/~cao/papers/active-cache.ps