ISP Deploy


Posted on 13-Apr-2015








<p>White Paper</p> <p>Best Practices for ISP Deployment</p> <p>Table of Contents
1. Network Design and Deployment Recommendations
 1.1 Design goals and assumptions
 1.2 Cache farm components and design
  1.2.1 Consumer Edge routers and Policy Based Routing
  1.2.2 L4-7 Load Balancing Switches
  1.2.3 L2 Switches
  1.2.4 ProxySG
  1.2.5 Designing for Bandwidth Gain
2. Proof of Concept
 2.1 PoC
3. Rollout and Service-In
4. Staging Environment and Change Management
 4.1 Staging environment
 4.2 Change management
5. Operations
 5.1 Monitoring
 5.2 Bypass Lists
 5.3 Common issues and resolutions
Appendices
 Appendix A. Recommended Settings for Service Providers (Trust Destination IP (SGOS version 5.2 or above), HTTP Proxy, HTTP Proxy Profile, Connection Handling, SGOS Feature)
 Appendix B. Tuning Options (Investigation and Reporting, Tuning)
 Appendix C. Alternative designs (C.1 WCCP)
 Appendix D. Vendor Specific Options (D.1 F5: Overview, Cache Load Balancing, Optimization options)</p> <p>Introduction</p> <p>Blue Coat appliances are deployed primarily as a caching solution in Internet Service Provider environments globally to improve the user experience and save upstream bandwidth. Some ISPs also enable the content filtering option for regulatory or cultural reasons. Deployments range from small ISPs to Tier-1 providers that utilize dozens of Blue Coat ProxySG appliances. This document presents recommended design and deployment best practices and explains configuration settings specific to ISPs. Deployments that deviate from these best practices may see less than optimal bandwidth savings, stability, or overall performance. 
Some design alternatives are presented in the appendix along with their caveats.</p> <p>Network Design and Deployment Recommendations</p> <p>1.1 Design goals and assumptions</p> <p>Internet Service Providers deploy Blue Coat ProxySG devices in order to save upstream bandwidth costs and improve the end-user experience. The goal of this design and deployment best practice is to satisfy these requirements while delivering a scalable and robust cache farm that requires minimal changes to the existing network, is completely transparent to end-users and servers, and can be supported with the least possible impact on ongoing operations. In most ISP networks bandwidth gain ranges from 20-30%, but it is highly dependent on design decisions such as load balancing algorithms, the nature of traffic and web access among an ISP's users, and the sizing of the ProxySG devices. Designing for bandwidth gain is discussed later in this section. This best practice design assumes that the ISP network follows a topology similar to the diagram below. The cache farm should be deployed at a logical choke point in the network where web traffic can be redirected into the farm while minimizing the chance of asymmetric routing. Blue Coat's recommendation is to utilize Policy Based Routing (PBR) on the core routers that connect the consumer edge (CE) network. Port 80 (HTTP) egress and ingress (bi-directional) traffic is routed from the CE routers to L4-7 load balancing switches, which then distribute the HTTP traffic to a farm of proxy/caches. The CE routers are recommended because in most networks they provide a choke point without asymmetric routes. 
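</p> <p>As a back-of-the-envelope illustration, the 20-30% bandwidth gain range above can be translated into upstream link savings. This is a sketch only; the peak traffic figure below is an assumed example, not a measurement:</p>

```python
# Illustrative sketch: upstream bandwidth saved by the cache farm.
# peak_http Mbps and the hit-rate range are assumptions, not measured data.

def upstream_savings_mbps(peak_http_mbps: float, byte_hit_rate: float) -> float:
    """Bandwidth served from cache instead of crossing the upstream links."""
    return peak_http_mbps * byte_hit_rate

peak_http = 2000.0  # assumed peak HTTP traffic crossing the CE routers, in Mbps
for hit_rate in (0.20, 0.25, 0.30):  # the 20-30% gain range cited above
    saved = upstream_savings_mbps(peak_http, hit_rate)
    print(f"{hit_rate:.0%} byte hit rate -> {saved:.0f} Mbps saved upstream")
```

<p>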
The cache farm can also be tuned to the requirements of the specific consumer network behind the CE routers.</p> <p>[Figure: ISP topology from Access (ADSL) through the Edge (CE and PE routers) to the Core (BGP peering, IP core, Internet exchange). PBR and the connection to the L4 switches should be done on the consumer edge (CE) routers.]</p> <p>1.2 Cache farm components and design</p> <p>As noted above, the best practice design utilizes Policy Based Routing to L4-7 load-balancing switches that are connected to a cache farm. The design should include the following:
-&gt; L4-7 switches should be deployed in a redundant configuration off the core switches/routers
-&gt; PBR should be routed to a floating IP/network address on the L4-7 switches for both ingress and egress traffic
-&gt; L2 switches should be used between the L4-7 switches and the ProxySGs
-&gt; ProxySGs should be dual-homed to the L2 switches
-&gt; L4-7 switches should be configured with a virtual IP (as in VRRP) to be used as the default gateway for the ProxySGs in the cache farm
-&gt; L4-7 switches should be configured to use destination-based load balancing
Based on the ISP topology above, the cache farm topology should follow the design shown below. Solid lines represent primary links and dotted lines represent backup links.</p> <p>[Figure: Cache farm topology with redundant L4-7 switches (VIPs) off the CE routers and dual-homed ProxySGs. PBR on the consumer edge (CE) routers.]</p> <p>1.2.1 Consumer Edge routers and Policy Based Routing</p> <p>In order to eliminate ongoing changes or updates to the configuration of the core routers, the PBR setting should simply forward all port 80 traffic to the VIP of the L4-7 switches. Any bypassing of port 80 traffic should be performed by the L4-7 switches and/or the ProxySGs. PBR will need to be configured both for egress and ingress. 
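</p> <p>As an illustration of the bi-directional PBR described above, a Cisco IOS style configuration sketch is shown below. The VIP address (203.0.113.10) and the interface names are assumptions for the sketch, and the syntax is Cisco-specific; configuration notes for the actual routers in use are in the Appendix:</p>

```text
! Sketch only: 203.0.113.10 is an assumed VIP on the L4-7 switches.
! Egress: client-to-Internet traffic destined to port 80.
access-list 180 permit tcp any any eq www
route-map CACHE-EGRESS permit 10
 match ip address 180
 set ip next-hop 203.0.113.10
!
! Ingress: return traffic from the Internet with source port 80.
access-list 181 permit tcp any eq www any
route-map CACHE-INGRESS permit 10
 match ip address 181
 set ip next-hop 203.0.113.10
!
interface GigabitEthernet0/1
 description Consumer-facing interface (egress PBR)
 ip policy route-map CACHE-EGRESS
!
interface GigabitEthernet0/2
 description Core-facing interface (ingress PBR)
 ip policy route-map CACHE-INGRESS
```

<p>Note that the route-maps deliberately contain no bypass entries: per the recommendation above, any bypassing of port 80 traffic is handled on the L4-7 switches and/or the ProxySGs.</p> <p>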
In some deployments it may be required to perform PBR only on specific subnets: for example, only the consumer DSL subnets but not the enterprise customer subnets. Best practice is to perform PBR on the CE routers, which avoids this issue since each set of CE routers is already dedicated to a particular type of customer. Important: Locally hosted Akamai servers or similar internal CDN infrastructure should be excluded from the PBR. Configuration notes for specific routers may be found in the Appendix.</p> <p>1.2.2 L4-7 Load Balancing Switches</p> <p>The L4-7 switches should be capable of load balancing based on destination. Optimal caching performance is obtained when the destination is the URI, or a component of the URI, rather than the destination IP. L4-7 switch load balancing must be flow based rather than packet based, and packets forwarded from the L4-7 switches must include the client IP address. Proxies should use the L4-7 switches as their default gateway, and the L4-7 switch must be capable of passing the return traffic to the correct proxy. Configuration notes for specific L4-7 switches may be found in the Appendix.</p> <p>1.2.3 L2 Switches</p> <p>L2 switches are used between the L4-7 switches and ProxySGs to provide active-standby connectivity, future expansion, and cost savings. Spanning tree should not be necessary on the L2 switches and is not recommended due to the extra overhead. This implies, of course, that the L2 switches are not directly interconnected but that the L4-7 switches and ProxySGs are dual-homed to each L2 switch.</p> <p>1.2.4 ProxySG</p> <p>SGOS version: Blue Coat recommends that all service providers run SGOS 4.2 (the current release as of 10/1/2008 is SGOS 4.2.8.6). ISP-specific configuration: There are some configuration options of the ProxySG recommended for ISPs that differ from the ProxySG default settings. Please refer to the Appendices for a detailed review of the recommended changes. 
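</p> <p>To make the destination-based load balancing recommended for the L4-7 switches (section 1.2.2) concrete, the following minimal Python sketch hashes the destination host so that every request for a given site lands on the same cache, which keeps each object stored only once across the farm. The farm size and host names are illustrative assumptions:</p>

```python
import hashlib

def pick_cache(host: str, farm_size: int) -> int:
    """Deterministically map a destination host to a cache index in [0, farm_size)."""
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % farm_size

farm_size = 8  # assumed number of ProxySGs in the farm
# The same destination always maps to the same ProxySG:
print(pick_cache("example.com", farm_size) == pick_cache("example.com", farm_size))
```

<p>A production L4-7 switch would typically use a consistent-hash variant of this idea, so that adding or removing a cache remaps only a fraction of destinations rather than reshuffling the whole farm.</p> <p>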
1.2.5 Designing for Bandwidth Gain</p> <p>Caching is a statistics game. The Internet contains billions of objects, and each proxy can only cache a finite amount of content; to provide optimal performance and bandwidth savings, the most popular content must be effectively distributed across the largest possible number of proxies. Cache performance is a function of the total number of disk spindles, memory, and CPU. The bandwidth savings of the entire cache farm is tied to the total number of unique web objects the farm can cache, which in turn is tied to the total disk spindles. Memory affects both how much data can be cached in RAM and how many connections can be handled simultaneously. CPU dictates the total throughput of an individual proxy. Sizing should be done to satisfy throughput requirements as well as caching requirements: the solution must be capable of handling peak HTTP load, and ideally there should be enough disk space to store the equivalent of 2-3 days of HTTP traffic.</p> <p>Caching Strategy</p> <p>The load balancing solution should be capable of load balancing based on destination, ideally based on components of the URL rather than by destination IP. This prevents duplicate content from being distributed across the caching layer. Additionally, the load balancing layer can be used to implement and refine how the caches are deployed to maximize bandwidth gain. Refer to the tuning section of the Appendices for more details.</p> <p>Platform Selection</p> <p>Based on the above requirements there are two Blue Coat models that make sense for service providers, the 810-25 and the 8100-20. Each has different pros and cons:
8100-20
Pros:
- 8 disks in a single 4U chassis
- 4 CPU cores allow 2-3x the throughput of the 810-25
Cons:
- Half the disk density of the 810-25: 8 disks in 4U vs. 
16.
810-25
Pros:
- Allows for the greatest disk density: each system provides 4 spindles in a single rack unit
- 2 CPU cores allow peak throughput of 100-150 Mbps
Cons:
- Single power supply
- More systems to manage compared to the 8100 platform</p> <p>Proof of Concept</p> <p>2.1 PoC</p> <p>Proof of Concept testing should focus on validation of interoperability, functionality, basic management, and troubleshooting. Performance testing should be out of scope for Proof of Concept testing; information regarding QA and load testing done for Blue Coat's products can be provided under NDA. Due to the nature of testing Blue Coat functionality and features in an ISP environment, it is impractical to attempt any sort of simulated testing. It is simply not possible to generate traffic that simulates real users and real Internet servers. A simulated test can only give a general sense of how much load a platform might handle; bandwidth gain and precise performance measurements could never be evaluated. Additionally, every ISP has a slightly different environment of routers, switches, and L4-7 redirection gear. Generally speaking this equipment is very expensive and cannot be easily or effectively duplicated in a lab. Blue Coat regularly does simulated tests internally, and information on such tests can be provided to a customer as needed. However, if part of the PoC criteria requires testing under load, Blue Coat strongly recommends using live traffic. The PoC can follow the rough outline illustrated here.</p> <p>Lab Testing</p> <p>In the event that a lab environment is maintained with identical equipment to the production routing and switching environment, some basic lab integration testing is recommended. The intent of this testing is to make sure that all the pieces of equipment work together with the recommended software revisions. 
This step is only useful if identical equipment exists in the lab; testing on smaller versions of the same type of equipment will typically lead to misleading results. It can be done, but the experience gained from that equipment should be interpreted with caution; it should not be assumed that lessons learned on lower classes of equipment apply directly to the high end.</p> <p>Initial installation</p> <p>The proxy should be installed into its final location in the network. Because traffic must be explicitly redirected to the proxy, installing it in this location can do no harm. At this point basic connectivity to the Internet and client networks should be established, and connections to the management network should also be set up.</p> <p>Explicit Proxy Testing</p> <p>The proxy should be configured with a rule set to deny all traffic except specific source subnets, and explicit proxy should then be enabled. Basic load testing should be done to ensure that the proxy has adequate connectivity to DNS, default and backup routes, etc. Blue Coat typically recommends Web Timer as a simple tool for this sort of testing; it can also be used to illustrate performance gains and bandwidth savings with a small group of web sites.</p> <p>L4-7 testing</p> <p>Testing basic L4-7 functionality will have some minimal impact on the production network; the degree of impact depends on the topology chosen. As a best practice Blue Coat recommends using policy based routing functionality to route traffic to the L4-7 switching devices. At this point a specific test network or set of IP addresses should be identified. In this case a test network means a network without any real user traffic on it, solely testing workstations. A policy based route must be added to the production routers to redirect the test network's port 80 traffic to the L4-7 device. 
This should generally be configured in stages. The first stage redirects traffic with a single PBR for outbound port 80 traffic, with IP spoofing not enabled on the ProxySG; basic L4-7 functionality and load balancing is then tested, and things like failover can also be tested at this stage. The second stage adds a second PBR for source port 80 return traffic. This verifies that the L4-7 device is returning traffic to the correct Blue Coat device and that there are no asymmetric routes.</p> <p>End to End Functionality testing</p> <p>At this stage the customer should test, from their test network, all the basic functionality expected from Blue Coat. Customer testing: identify specific customer subnets and redirect them, if desired, during off-peak hours. Monitor via access logs, statistics, and actual end-user experience from a real client if possible.</p> <p>Rollout and Service-In</p> <p>If the PoC was conducted on the production network, the service-in is straightforward as the equipment is already in place. Deploying into the live network follows the same steps recommended above for Proof of Concept. During the PoC or the rollout it is recommended that operations and support procedures be validated. Common items to test are:
-&gt; Software upgrade and downgrade/rollback
-&gt; Configuration changes (see the Staging Environment and Change Management section below)
-&gt; Troubleshooting and resolving a problem website using bypass lists (see the Operations section below)
-&gt; SNMP traps for CPU load and memory pressure on the ProxySGs
-&gt; Failover and failback scenarios in the overall caching infrastructure (load balancer outage, proxy outage, etc.)</p> <p>Staging Environment and Change Management</p> <p>4.1 Staging environment</p> <p>Blue Coat highly recommends that all service provider solutions also have a staging environment. 
This staging environment does not necessarily need to replicate the scale and completeness of the production network; it can be a smaller ProxySG where configuration changes are validated before being deployed into the live network. The ProxySG used for staging can be:
-&gt; standalone in a lab where a client PC is placed on the LAN-side port and transparently proxied to the In...