8/4/2019 QoS in Data Networks
1/33
Quality of Service in Data Networks Primer
Christopher Larson, CCIE #12380
COPYRIGHT 2002 SUPERIOR TECHNOLOGY NETWORKS
PREFACE
This document was written to address Quality of Service in networks. It was started as part of the server-based computing project. Although the paper does not deal with server-based computing directly, server-based computing performance is dependent on the quality of the network. Best practice would dictate that Quality of Service planning not be based on any particular application, such as server-based computing (SBC) or remote desktop protocol (RDP), and should instead be done at the enterprise level to include all possible application classes. Since planning should be done at the enterprise level rather than the project level, no specific mention of server-based computing is made; rather, a framework for implementing quality of service in the enterprise, and guidelines for doing so, are presented. Implementing QoS would have a direct effect on the performance of the SBC systems if they were placed in the appropriate traffic class.

Quality of Service for voice, video and data integration (AVVID) is not expressly covered in this document. The document does contain recommendations for classifying voice traffic in the recommendations section. Beyond those classification recommendations, voice is used only for illustration of points, as the levels of service for voice are fixed and easy to implement. Any voice implementation would not be obstructed by the guidelines in this document.
ABSTRACT
This document outlines Quality of Service: what it is, the tools used to implement it, and guidelines and recommendations for doing so.
Keywords: Quality of Service, Quality of Service Recommendations, QoS.
Table of Content

PREFACE
ABSTRACT
TABLE OF CONTENT
SECTION 1. INTRODUCTION
SECTION 2. QUALITY OF SERVICE OVERVIEW
  2.1 What is Quality of Service [QoS]
  2.2 Why Use Quality of Service
  2.3 QoS Requirements for Data
    2.3.1 Over Engineering
  2.4 Service Provider Selection and Responsibility
SECTION 3. PLANNING QOS IN DATA NETWORKS
  3.1 The Relative Priority Model of Classifying Applications
    3.1.1 Deciding on Classes of Traffic
SECTION 4. CLASSIFICATION TOOLS
  4.1 Class of Service (Layer 2)
  4.2 Type of Service and Differentiated Services Code Points (Layer 3)
  4.3 Per Hop Behaviors [PHB]
    4.3.1 Network Based Application Recognition [NBAR]
SECTION 5. SCHEDULING TOOLS
  5.1 Weighted Fair Queuing [WFQ] and Class-Based Weighted-Fair Queuing
    5.1.1 Flow-Based Weighted Fair Queuing
    5.1.2 WFQ and IP Precedence
    5.1.3 Class-Based Weighted-Fair Queuing [CBWFQ]
  5.2 Weighted Random Early Detect [WRED]
  5.3 Low Latency Queuing [LLQ]
SECTION 6. MANAGEMENT TOOLS
SECTION 7. RECOMMENDATIONS
  7.1 Implementation Recommendations
    7.1.1 Access Layer
    7.1.2 Distribution Layer
    7.1.3 WAN Connections
  7.2 Classification Recommendations
    7.2.1 Voice Traffic
    7.2.2 Voice Control
    7.2.3 Video Conferencing
    7.2.4 Streaming Video
    7.2.5 Mission-Critical Data
    7.2.6 Less-Than-Best-Effort Data
    7.2.7 Best-Effort Data
APPENDIX A (APPENDIX TITLE)
ABBREVIATIONS AND ACRONYMS
GLOSSARY
REFERENCES
INDEX
TABLE OF FIGURES
Figure 4-1 Layer 2 802.1Q header with 802.1p information
Figure 4-2 IP Version 4 Packet showing ToS IP Precedence and DSCP
Figure 5-1 LLQ and CBWFQ together
Figure 6-1 Quality of service lifecycle
Figure 7-1 Hierarchical Switching Model
Figure 7-2 Server Farm
Figure 7-2 Two models of implementation for Cat 3550
SECTION 1. INTRODUCTION
Many times, an enterprise first begins to look at, or even hear of, quality of service [QoS] when considering voice over IP, video teleconferencing or some other application that requires it to function properly. This reactive posture can work against core business objectives and the efficiency of the network serving the systems and applications used to meet those objectives. Bandwidth can be monopolized by traffic that is less than business-related, business-related but not critical, or non-business and even undesirable. QoS makes more efficient use of bandwidth by increasing the drop preference of less-than-best-effort traffic and of other traffic according to its class. Of particular consequence are non-business-related or less-than-best-effort applications. Peer-to-peer media-sharing and file-sharing applications such as Kazaa, Napster and Morpheus, along with instant messengers, fit into this category; they are the least desirable traffic. These applications can spread over time, gradually robbing bandwidth unnoticed until user complaints surface about the performance of the network or of a particular business application. The cause can go undiagnosed, and more bandwidth is purchased or culprit applications are policed in an attempt to indirectly improve the bandwidth and performance available to business-oriented applications. This is a short-lived approach; important applications may improve temporarily until another less important, bandwidth-intensive application emerges. Policing policies can also become complex to administer and create static limits that aren't always desirable. For example, data backups usually occur overnight, using the additional bandwidth available during non-peak hours. With policing policies, that unused bandwidth might not be available, which could cause the backup process to carry over into morning work hours. A proactive approach is to provision classes of traffic. This allows a QoS-enabled network to dynamically adjust levels of service to applications according to network conditions. This guide is designed as a framework for identifying and classifying different types of traffic, with recommendations for providing them with varying levels of service.
SECTION 2. QUALITY OF SERVICE OVERVIEW
2.1 What is Quality of Service [QoS]

Quality of Service, or QoS, is defined as the measure of performance for a system that reflects its service availability and transmission quality.

Service availability - the basic foundation of quality of service. It ensures that services and applications are available.

Transmission quality - the quality of network transmissions is defined by three elements: loss, delay and delay variation.

Loss
Loss is defined as a comparison of total packets received against total packets transmitted. It is expressed as a percentage of packets dropped.

Delay
Also known as latency, delay is the amount of time it takes a packet to reach an end node from a source node. Data networks experience serialization delay, which is the amount of time it takes to place the bits of a data packet onto the physical media, and propagation delay, the amount of time it takes to transmit the bits of a packet across the physical wire. In networks that carry voice there is an additional delay called packetization delay: the amount of time it takes to encode an analog signal into a digital signal.
o Fixed network delays are finite and measurable: for example, the time it takes to encode and decode signals such as voice and video, the time required for electrical signals to traverse media, and the time for optical signals to be converted into electrical signals.
o Variable network delays reflect the condition of the network. Congestion is a variable network delay; it can be measured at any given point in time, but it is not fixed.

Delay Variation
Delay variation (or jitter) is the difference in delay between packets. If one packet takes 25 milliseconds to traverse the network and the next packet takes 50 milliseconds, the delay variation is 25 milliseconds. Jitter is more of a concern for networks carrying voice and video.

Quality of Service policy allows for managing service availability and transmission quality. It can provide better levels of service and availability to systems and applications. Policies are applied to the network infrastructure.
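The three transmission-quality elements defined above can be computed from simple packet statistics. Below is a minimal sketch (illustrative only; the helper names are mine, not part of this primer) that measures loss as a percentage of transmitted packets and jitter as the difference in delay between consecutive packets:

```python
def packet_loss_pct(sent: int, received: int) -> float:
    """Loss: packets dropped, expressed as a percentage of packets transmitted."""
    return 100.0 * (sent - received) / sent

def jitter_ms(delays_ms: list[float]) -> list[float]:
    """Delay variation: absolute difference in delay between consecutive packets."""
    return [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]

# 1000 packets sent, 990 received -> 1.0% loss
print(packet_loss_pct(1000, 990))   # 1.0
# One packet takes 25 ms, the next 50 ms -> 25 ms of delay variation
print(jitter_ms([25.0, 50.0]))      # [25.0]
```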
2.2 Why Use Quality of Service

Voice, video and mission-critical applications have more stringent requirements of the network than general traffic. Without quality of service, voice, video and mission-critical applications are degraded and could at times become unusable.
2.3 QoS Requirements for Data
In determining the quality of service requirements for data application traffic, several things should be considered:

- Applications should be profiled to get an idea of their requirements and network behavior. These are called traffic profiles. Traffic profiles can be generated automatically by many network management applications.
- Don't over-engineer QoS; use the relative priority model as described in section 3.1.
- Use no more than four categories of class:
o Best or Platinum (Mission Critical) - ERP, transactional and in-house software.
o High or Gold (Guaranteed Bandwidth) - streaming video, messaging (not e-mail), Intranet enterprise applications.
o Best-Effort (best effort and default class) - Internet, e-mail.
o Less-than-Best-Effort (higher drop preference, optional class) - Kazaa and other peer-to-peer network applications, FTP, backups.
- Try not to assign more than three applications to each individual protected (Platinum, Gold) class. Avoid the temptation to put all applications in a protected class.
- Use proactive QoS provisioning policies rather than reactive ones.
- Obtain executive endorsement of the ranking of application priority from a QoS perspective. That is not to say that one application is more important than another; rather, one application may have greater QoS needs, based on its profile and bandwidth usage, even though it may be considered less mission-critical. Getting executive approval will help keep a QoS implementation project from derailing.
2.3.1 Over Engineering
Bandwidth requirements can vary by large amounts from application to application, and sometimes even among functions within an application. For this reason it is not possible to provide a standard rule for provisioning data bandwidth. Traffic analysis is required to know the bandwidth requirements of any given application. Traffic patterns can also vary greatly between different versions of the same application.

As an example of basic bandwidth provisioning and QoS over-engineering, we will use a common ERP application, SAP. Let's assume that the most common transaction on the SAP system at branch offices is sales order creation. A sales order transaction requires 14KB of data, which translates into 112kbps of bandwidth to keep response time under one second. If SAP is provisioned as Platinum-level mission critical, and the Platinum level (not SAP itself) receives 25% of the link capacity, then a link size of approximately 512kbps serves this level on a 2-megabit connection. This is acceptable.
Now let's assume the enterprise implements a newer version of SAP using uncompressed HTML. The same transaction might require 490KB per transaction. If the provisioned link remained the same (512kbps), the response time would be 32 seconds per transaction.

This would clearly be a situation where QoS by itself is not sufficient; additional bandwidth would need to be purchased to maintain the same level of service. These types of calculations should be taken into account when implementing a QoS policy and in determining when QoS is not the answer and more bandwidth is needed. Absolute application provisioning would require a whole slew of calculations and assumptions that would never hold true on a day-to-day basis. Therefore, rather than attempting to determine exact kilobit bandwidth requirements for data applications, a simpler and proven approach is to assign relative priorities to data applications, as discussed in section 3.1.
2.4 Service Provider Selection and Responsibility
Many people talk about QoS not being available over the Internet or from the service provider. You don't actually get, or need, QoS from the provider; all you need is a service level agreement that meets your needs. Quality of service is only as strong as its weakest link. For this reason, selecting a service provider that can meet the needs of the enterprise and provide service level agreements in line with the enterprise's quality of service policy is important. I will use voice as an example, as the quality of service requirements for voice are fixed and easily examined.

End-to-end requirements for voice and video conferencing are:

- No more than 150 milliseconds of one-way latency, mouth to ear (per ITU G.114)
- No more than 1% loss
- No more than 30 milliseconds of delay variation

Considering these factors, the service provider's service level should be near these metrics:

- No more than 60 milliseconds of one-way latency, mouth to ear
- No more than 0.5% loss
- No more than 20 milliseconds of delay variation

The requirements for voice and videoconferencing, and the service provider's ability to meet the defined SLAs, are important. With this in mind, remember that voice has some very stringent requirements; SLAs that satisfy your data QoS policies can likely be much more lenient and less restrictive. It is not necessary to control what happens outside of the enterprise network; only to obtain guarantees from service providers, where needed, that meet the needs of the enterprise.
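A provider's offered or measured service level can be checked mechanically against these ceilings. A minimal sketch, using the thresholds quoted above (the dictionary and function names are mine):

```python
# End-to-end voice/videoconferencing targets (per ITU G.114 as cited above)
# and the tighter provider-segment targets suggested in the text.
END_TO_END   = {"latency_ms": 150, "loss_pct": 1.0, "jitter_ms": 30}
PROVIDER_SLA = {"latency_ms": 60,  "loss_pct": 0.5, "jitter_ms": 20}

def meets_sla(measured: dict, sla: dict) -> bool:
    """True when every measured metric is at or under its SLA ceiling."""
    return all(measured[k] <= sla[k] for k in sla)

print(meets_sla({"latency_ms": 45, "loss_pct": 0.2, "jitter_ms": 12}, PROVIDER_SLA))  # True
print(meets_sla({"latency_ms": 75, "loss_pct": 0.2, "jitter_ms": 12}, PROVIDER_SLA))  # False
```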
SECTION 3. PLANNING QOS IN DATA NETWORKS
QoS is essentially segregating applications and giving preference to certain applications over others. With voice and video, the need for QoS is relatively obvious; however, this is not the case with data applications. Arriving at the design principle of relatively few classes of data
traffic, and assigning only a few applications to those classes, opens up a variety of subjective, non-technical issues. This is because the enterprise is left to rank its applications in relative priority, a process that is usually very politically and organizationally charged. Much of this friction comes from how applications are ranked. Applications should be ranked by their QoS needs according to traffic profile, relative to their importance in the organization. In other words, if an application is indeed mission critical but has no need for stringent QoS, it should not be placed in the same class as other mission-critical applications that do need better QoS. The basis of determination should be to first identify mission-critical applications and then place them in the relative priority model based on traffic profile. This may assist in removing some of the political barriers to realizing a QoS implementation.
3.1 The Relative Priority Model of Classifying Applications
The first step in implementing QoS is to categorize the enterprise's applications, that is, to separate them into classes of traffic. The relative priority model is well suited to data applications. Data traffic can usually be readily identified as important, best-effort or less-than-best-effort. Mission-critical traffic is generally assigned the highest level of priority, but this is determined by its traffic profile.

- Platinum Level - Mission critical: applications that directly contribute to the core of the business or business operations. Examples of mission-critical applications include ERP applications, such as SAP, Oracle and PeopleSoft, as well as proprietary applications designed in-house. Some applications, even though they are viewed as mission-critical, are better suited to the Gold or even Best-Effort classes. E-mail, for example, is considered by most organizations to be mission critical; however, because e-mail is highly asynchronous, there is no need to give it a Platinum or Gold classification.

- Gold Level - Secondary: applications generally viewed as secondary in importance to business operations, or highly asynchronous in nature. These include NetMeeting and messaging applications, some groupware or collaborative applications, and Intranet HTML applications.

- Best-Effort - Default class: applications that play an indirect role in normal enterprise operations. While some of these applications might be interactive, no bandwidth guarantees are required. Perhaps the best examples are e-mail, generic Internet browsing and non-enterprise instant messaging applications.

- Less-than-Best-Effort - Bandwidth-intensive/non-business-related: a class for applications that are bandwidth intensive and may have nothing to do with the enterprise's business. These applications are typically highly delay- and drop-insensitive, and their execution can often span hours. They can therefore be given a higher drop preference to prevent them from robbing bandwidth from best-effort or higher applications. Examples include large file transfers, backup operations, and peer-to-peer or entertainment-media swapping applications (such as Napster, Kazaa and Gnutella).
3.1.1 Deciding on Classes of Traffic
It is counterproductive to assign too many priority levels; it is recommended to stick to four, as the relative priority model suggests. The reasons can be illustrated by an analogy. Liken the network's traffic to a pizza, and assume the pizza is divided into 32 pieces, with the largest going to the person hosting the pizza party. With that many pieces, it would be hard to tell who got the largest.

Similarly, care should be taken in deciding which applications are assigned to which class. Assigning all, or even a majority, of applications to the same priority class because everything is considered mission critical would be the same as not implementing any QoS at all.

Once applications have been categorized, there needs to be a way for the network to identify the classes of traffic and schedule their delivery.
SECTION 4. CLASSIFICATION TOOLS
The next step after categorizing data is to identify the traffic that is to be given special treatment. The act of classifying the traffic is called marking, or sometimes coloring, the traffic. Marking is usually done at the edge of the network, or as close to the source of traffic as possible. The place where traffic is marked or trusted is called a trust boundary; the marking of traffic sets the boundary within which policy can be enforced. Any of the following can be used to identify and mark classes of traffic:

- Layer 2 parameters - MAC address, 802.1Q Class of Service [CoS] bits, Multi-protocol Label Switching [MPLS] experimental values
- Layer 3 parameters - source/destination IP address, IP Precedence, Differentiated Services Code Points [DSCP]
- Layer 4 parameters - TCP or UDP ports
- Layer 7 parameters - application signatures

It is only after traffic can be identified that QoS policy can be enforced; traffic is marked so that it can be identified. Best-practice design recommendations are to mark with DSCP values as close to the source of the traffic as possible. Traffic identification can then be done throughout the network by examining the markings. If markings and classification are set properly, intermediate points in the network are not required to perform detailed identification and can simply apply QoS scheduling policies based on the previously set values. This reduces administration and CPU overhead.

It should also be noted that when implementing QoS, all traffic ingressing at the WAN edge should be marked. This prevents outside users and applications from marking their own traffic and receiving special treatment contrary to the enterprise's QoS policy.
The mechanisms used for marking traffic are:

- Class of Service bits
- Type of Service bits and Differentiated Services Code Points [DSCP]
- Per-hop behaviors
- Network Based Application Recognition
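As a rough illustration of edge classification by Layer 3/4 parameters, a marker at the trust boundary can be thought of as a table lookup. The port-to-DSCP table below is hypothetical and for illustration only, not a recommendation from this primer:

```python
# Hypothetical mapping of (protocol, destination port) -> DSCP value,
# applied at the trust boundary, as close to the source as possible.
PORT_MARKINGS = {
    ("udp", 16384): 46,   # e.g. a voice bearer stream marked EF
    ("tcp", 3200):  26,   # e.g. an ERP dialog port marked AF31 (Platinum)
    ("tcp", 80):     0,   # generic web browsing left at best effort
}

def mark_packet(proto: str, dst_port: int, default_dscp: int = 0) -> int:
    """Return the DSCP value to stamp on a packet at the network edge."""
    return PORT_MARKINGS.get((proto, dst_port), default_dscp)

print(mark_packet("tcp", 3200))   # 26
print(mark_packet("udp", 9999))   # 0  (unknown traffic falls to best effort)
```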
4.1 Class of Service (Layer 2)

Packets can be marked using the Layer 2 Class of Service bits. These bits are part of the User Priority field of the 802.1p portion of an 802.1Q header.
[Figure: 802.1Q-tagged Ethernet frame - Preamble, SFD, DA, SA, Type, 4-byte TAG (PRI, CFI, VLAN ID), PT, DATA, FCS; the first three bits of the TAG (PRI) carry the CoS value]

Figure 4-1 Layer 2 802.1Q header with 802.1p information
Figure 4-1 shows where the Class of Service bits sit in an 802.1Q header. This Layer 2 information can be mapped to the Layer 3 Type of Service field discussed in the next section.
4.2 Type of Service and Differentiated Services Code Points (Layer 3)

The Layer 2 media will often change along the network path, for instance when a packet travels from an Ethernet segment onto a serial line, so a more persistent classification is needed at Layer 3. The Type of Service [ToS] byte is the second byte in an IPv4 packet. The first three bits of the ToS byte, by themselves, are referred to as the IP Precedence bits. These same bits, in conjunction with the next three bits, are known collectively as the DSCP, or Differentiated Services Code Point, bits. This is illustrated in figure 4-2.
[Figure: IPv4 packet - Version, Length, ToS (8 bits), Len, ID, Offset, TTL, Proto, IP-SA, IP-DA, DATA, FCS. Within the ToS byte (bits 7-0): bits 7-5 are the IP Precedence; bits 7-2 are the Differentiated Services Code Point; IETF DiffServ may use the remaining 2 bits for flow control]

Figure 4-2 IP Version 4 Packet showing ToS IP Precedence and DSCP
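The bit positions shown in Figure 4-2 can be decoded directly: IP Precedence is the top three bits of the ToS byte, and DSCP is the top six. A small sketch (the helper names are mine):

```python
def ip_precedence(tos: int) -> int:
    """IP Precedence: the top three bits of the ToS byte."""
    return tos >> 5

def dscp(tos: int) -> int:
    """DSCP: the top six bits of the ToS byte."""
    return tos >> 2

# ToS byte 0xB8 (binary 1011 1000) -> precedence 5, DSCP 46 (Expedited Forwarding)
print(ip_precedence(0xB8))  # 5
print(dscp(0xB8))           # 46
```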
4.3 Per Hop Behaviors [PHB]
The Internet Engineering Task Force [IETF] has defined per-hop behaviors [PHB] for traffic marking in RFC 2597 and RFC 2598. PHBs, like Type of Service and Class of Service bits, are used to identify the service levels to be provided by nodes in the network infrastructure, and each relates directly to a DSCP decimal value. PHB and DSCP are becoming ever more important as the IETF continues to standardize the DiffServ specifications.

The three broad classes of PHBs are:

- Best Effort (BE, or DSCP 0)
- Assured Forwarding (AFxy)
- Expedited Forwarding (EF, or DSCP 46)

There are four subclasses of Assured Forwarding, corresponding to IP Precedence values, and within each subclass there are three levels of drop preference. For example, AF42 refers to Assured Forwarding class 4, drop preference 2.

DSCP values can be expressed in decimal form or with PHB keywords; for example, DSCP EF is synonymous with DSCP 46, and DSCP AF31 is synonymous with DSCP 26.
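The AFxy keywords map to DSCP decimal values by a simple rule from RFC 2597: the AF class occupies the high-order bits and the drop preference the next two, giving DSCP = 8 x class + 2 x drop preference. A quick sketch:

```python
def af_to_dscp(af_class: int, drop_pref: int) -> int:
    """DSCP decimal for Assured Forwarding class 1-4, drop preference 1-3 (RFC 2597)."""
    return 8 * af_class + 2 * drop_pref

print(af_to_dscp(3, 1))  # 26 -> AF31, matching the equivalence noted above
print(af_to_dscp(4, 2))  # 36 -> AF42
EF = 46                  # Expedited Forwarding is a fixed codepoint (binary 101110)
```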
4.3.1 Network Based Application Recognition [NBAR]
Most data applications can be identified by their IP addresses or TCP/UDP port numbers. However, some applications cannot be identified using only these values, many times by design: peer-to-peer and other media-sharing applications like Napster or Kazaa deliberately negotiate ports with the express purpose of firewall penetration. When the Layer 3 and 4 information is not sufficient to identify an application, Network Based Application Recognition [NBAR] may be a solution. NBAR can be more CPU intensive than identifying traffic by DSCP or access lists and is generally deployed on Internet perimeter routers; it may also have specific memory requirements. NBAR uses a Packet Description Language Module [PDL] to identify traffic. A Packet Description Language entry is basically an application signature, much like those used by network Intrusion Detection Systems [IDS]. In the 12.2 IOS code there are over 70 PDLs, or signatures. PDLs are modular and can be added to an IOS image without an IOS upgrade. Cisco Express Forwarding [CEF] is required to use NBAR as a marking mechanism.
SECTION 5. SCHEDULING TOOLS
After marking traffic, the next step in implementing QoS is scheduling. Scheduling refers to when a packet leaves an interface, and it uses the traffic markings to differentiate between classes of traffic. At any port of a network device, whenever input is received faster than it can be output, there is congestion. Most devices have multiple buffers in which to queue these packets, and the order in which the queues are serviced can be defined by the scheduling mechanism in use. Queuing can be likened to a funnel: packets flow out of the narrow end as the funnel fills, and when the funnel is full, packets are dropped. Scheduling packets out of the funnel, or choosing which packets get dropped, gives some packets preference over others and is a more efficient way of managing traffic. A scheduling mechanism is only activated when congestion occurs; when congestion clears, the mechanism is deactivated. There are three suggested ways of scheduling:

- Weighted Fair Queuing [WFQ] and Class-Based Weighted-Fair Queuing [CBWFQ]
- Weighted Random Early Detect [WRED]
- Low Latency Queuing [LLQ]
5.1 Weighted Fair Queuing [WFQ] and Class-Based Weighted-Fair Queuing

In Cisco IOS, weighted fair queuing is used by default on serial interfaces at E1 (2.048 Mbps) speeds and below. When no other queuing strategy is configured, all other interfaces use FIFO by default. FIFO is a very primitive queuing strategy, and WFQ was designed to overcome its limitations. When FIFO is used, traffic is sent in the order received without regard for bandwidth
consumption or associated delays. File transfers and other high-volume network applications often generate series of packets of associated data, known as packet trains: groups of packets that tend to move together through the network. These packet trains can consume all available bandwidth, depriving other traffic of bandwidth; file transfer is a good example. They can also make the network move in repeating waves of slowness and recovery.
5.1.1 Flow-Based Weighted Fair Queuing
Flow-based weighted fair queuing is a dynamic scheduling method that tries to provide fair
bandwidth to all network traffic. Traffic is identified by a number of attributes such as source and
destination address and TCP or UDP ports. Weighted fair queuing [WFQ] applies weights to
identified traffic to classify it into conversations; this weight determines how much bandwidth
each conversation is allowed relative to other conversations. WFQ is an algorithm that
simultaneously schedules interactive traffic to the front of a queue to reduce response time and
fairly shares the remaining bandwidth among high-bandwidth flows. In other words, WFQ allows
you to give low-volume traffic, such as Telnet sessions, priority over high-volume traffic, such as
FTP sessions, while treating each traffic flow fairly so that neither dominates a link. WFQ gives
concurrent file transfers balanced use of link capacity; that is, when multiple file transfers occur,
the transfers are given comparable bandwidth.
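As an illustrative sketch (not Cisco's actual implementation), the conversation lookup that flow-based WFQ performs can be modeled by hashing a packet's flow identifiers into a fixed set of queues. The queue count and field names below are invented for the example:

```python
# Hypothetical model of flow identification: a conversation is keyed on
# addresses, protocol, and ports, then hashed to a queue. Illustrative only.
from collections import namedtuple

Packet = namedtuple("Packet", "src dst proto sport dport")

NUM_QUEUES = 256  # made-up figure; real dynamic-queue counts vary by platform

def flow_queue(pkt):
    """Map a packet to a conversation queue by hashing its flow identifiers."""
    flow_key = (pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport)
    return hash(flow_key) % NUM_QUEUES

# Packets belonging to the same flow always land in the same queue:
a = Packet("10.0.0.1", "10.0.0.2", "tcp", 23, 40000)   # a Telnet session
b = Packet("10.0.0.1", "10.0.0.2", "tcp", 23, 40000)
assert flow_queue(a) == flow_queue(b)
```

The point of the sketch is simply that every packet of a conversation hashes to the same queue, so each flow can be weighted and serviced as a unit.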
5.1.2 WFQ and IP Precedence
WFQ is IP Precedence aware. Each precedence level is given a weight of 1 plus its IP
Precedence level. This is not the actual byte count used, but it is good for illustrating how IP
Precedence awareness affects WFQ. For instance, if you have a queue for each of the IP
Precedence levels, the weight derived from the IP Precedence of the traffic will weight each
queue. For example:

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 = 36

Traffic with an IP precedence of 0 (queue 1) would be given 1/36th of the bandwidth, IP
precedence 1 would get 2/36ths, and so on up to IP precedence 7, which would get 8/36ths of the
bandwidth. (Precedence levels start at 0 and run to 7.)
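The arithmetic above can be worked through directly. This models only the illustrative (1 + IP precedence) weighting described in the text, not the real byte counts IOS uses internally:

```python
# Worked example of the illustrative (1 + IP precedence) weighting.
# Mirrors the 1+2+...+8 = 36 arithmetic in the text; a teaching model only.

PRECEDENCE_LEVELS = range(8)  # IP precedence 0 through 7

weights = {p: p + 1 for p in PRECEDENCE_LEVELS}
total = sum(weights.values())

shares = {p: weights[p] / total for p in PRECEDENCE_LEVELS}

print(total)      # 36
print(shares[0])  # 1/36 of the bandwidth for precedence 0
print(shares[7])  # 8/36 of the bandwidth for precedence 7
```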
5.1.3 Class-Based Weighted-Fair Queuing [CBWFQ]
CBWFQ extends WFQ's capabilities by allowing the creation of up to 64 user-defined classes of
traffic rather than queuing based solely on flows or IP precedence. CBWFQ is a merging of two
older queuing methods (custom queuing and fair queuing) into a more efficient mechanism:
custom queuing guarantees bandwidth, and weighted fair queuing dynamically ensures fairness
among queues. Each queue is serviced in a Weighted Round Robin [WRR] manner. CBWFQ is
an excellent mechanism for data traffic and is very efficient. It allows giving some classes of
traffic guaranteed minimum bandwidth, as well as WFQ for fair treatment among all classes of
traffic. For instance, a Gold class of traffic might be guaranteed 10% of link bandwidth along with
fair queuing for that class. CBWFQ will deliver the bandwidth guarantee in a fair manner among
the applications within the Gold class (based on IP precedence/DSCP value), while at the same
time providing fair queuing among all other classes of traffic and queues. When using CBWFQ,
all classes of traffic can use the aggregate queue. When congestion is experienced, the policy
begins to restrict traffic to its individual queue size. In this manner traffic allocation is dynamic:
if a class is not using its entire queue space, the remainder (or the whole queue) is allocated to the
aggregate for use by other classes. This allows all classes to fairly use bandwidth not being used
by other classes.
NOTE: It is recommended that the total bandwidth provisioned using CBWFQ not exceed 75%
of the link bandwidth. This ensures that routing protocols, TCP keepalives, and other Layer 2 and
Layer 3 protocols that are absolutely necessary to keep links up and traffic flowing have adequate
bandwidth.
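A trivial helper can sanity-check a CBWFQ bandwidth plan against the 75% guideline in the note above. The class names and percentages here are made-up examples, not a recommended policy:

```python
# Hypothetical sanity check of a CBWFQ bandwidth plan against the 75%
# provisioning guideline. Class names and percentages are illustrative.

def check_provisioning(classes, limit_pct=75):
    """Return (total_pct, ok) for a dict of class name -> guaranteed percent."""
    total = sum(classes.values())
    return total, total <= limit_pct

plan = {"gold": 30, "silver": 20, "voice": 20}   # example guarantees
total, ok = check_provisioning(plan)
print(total, ok)  # 70 True: within the 75% guideline
```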
5.2 Weighted Random Early Detect [WRED]
Tail drop is used for CBWFQ classes unless a class is specifically configured to use weighted
random early detect [WRED] to drop packets as a means of avoiding congestion. CBWFQ is a
congestion management technique, while WRED is a congestion avoidance technique. WRED
(pronounced "red" or "weighted red") is used to avoid congestion by detecting its onset and
selectively dropping TCP packets in an attempt to keep queues from ever filling. WRED allows
the transmission line to be used fully at all times. Normally, a network device's output buffers are
allowed to fill and then begin dropping additional packets trying to enter the queue; this is called
tail drop. WRED avoids the global synchronization problems that occur when tail drop is used as
the congestion avoidance mechanism. Global synchronization manifests when multiple TCP hosts
reduce their transmission rates in response to packet drops and then quickly increase their rates
again once the congestion is reduced, forming waves of congestion that crest, only to be followed
by troughs during which the transmission link is not fully utilized, and then repeat. Tail drop
exacerbates congested conditions by causing multiple resends from large groups of hosts. When
networks have a steady utilization, they can even settle into a rhythm in which peaks and troughs
are clearly identifiable in network graphs.
Random Early Detect [RED] aims to control the average queue size by indicating to the end
hosts when they should temporarily slow down transmission of packets. By randomly dropping
packets prior to periods of high congestion, RED tells the packet source to decrease its
transmission rate. Assuming the packet source is using TCP, it will decrease its transmission rate
until all the packets reach their destination, indicating that the congestion is cleared. TCP not
only pauses, it also restarts quickly and adapts its transmission rate to the rate that the network
can support. WRED combines the capabilities of the RED algorithm with the IP Precedence
feature to provide preferential handling of packets. WRED drops packets selectively based on IP
precedence: packets with a higher IP precedence are less likely to be dropped than packets with a
lower precedence. Thus, the higher the priority of a packet, the
better the chance that the packet will be delivered. By dropping some packets early rather than
waiting until the queue is full, WRED allows the transmission line to be used fully at all times
and minimizes the likelihood of global synchronization. Packet drop probability is based on the
minimum threshold, maximum threshold, and mark probability denominator. When the average
queue depth is above the minimum threshold, WRED starts dropping packets; the drop rate
increases as the average queue size increases, until it reaches the maximum threshold. The mark
probability denominator determines the fraction of packets dropped when the average queue
depth is at the maximum threshold. For example, if the denominator is 512, one out of every 512
packets is dropped when the average queue size reaches the maximum threshold. When the
average queue size exceeds the maximum threshold, all packets are dropped.
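The drop decision just described can be sketched as a function of the average queue depth. The thresholds and denominator below are illustrative, not recommended values:

```python
# Sketch of the WRED drop decision: no drops below the minimum threshold,
# a drop probability ramping linearly up to 1/mark_prob_denom at the maximum
# threshold, and tail drop beyond it. Parameter values are illustrative.

def wred_drop_probability(avg_depth, min_th, max_th, mark_prob_denom):
    if avg_depth < min_th:
        return 0.0                     # below minimum threshold: never drop
    if avg_depth >= max_th:
        return 1.0                     # beyond maximum threshold: drop all
    # Between thresholds, probability ramps linearly toward 1/mark_prob_denom.
    ramp = (avg_depth - min_th) / (max_th - min_th)
    return ramp / mark_prob_denom

print(wred_drop_probability(10, 20, 40, 512))  # 0.0 (queue below min threshold)
print(wred_drop_probability(40, 20, 40, 512))  # 1.0 (tail drop region)
```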
5.3 Low Latency Queuing [LLQ]
LLQ adds strict priority queuing to CBWFQ. Priority queues are serviced before all others, and
service continues exhaustively until no packets remain in the queue before moving on to any
other queues. LLQ is used in networks carrying voice. There can be multiple LLQ queues; each
LLQ queue is serviced until empty before moving to the next, and all LLQ queues are serviced
before the CBWFQ-configured queues. Figure 5-1 illustrates how CBWFQ and LLQ work
together.
Figure 5-1 LLQ and CBWFQ together. [Diagram: voice, video, and real-time traffic enters the
LLQ, which is always serviced first and serviced until empty. Mission-critical, best-effort, and
other classes of traffic enter CBWFQ (up to 64 queues), serviced at a specified rate and/or round
robin, and not serviced while packets are waiting in an LLQ queue. Both feed the output
interface.]
NOTE: It is recommended that the total of all LLQ queues not exceed 33% of the 75% total
bandwidth allowed for provisioning.
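The servicing order of Figure 5-1 can be modeled in a few lines of code. This sketch captures only the order in which queues are drained (strict priority for LLQ, then round robin across CBWFQ queues), not policing, weights, or byte counts:

```python
# Minimal model of LLQ + CBWFQ servicing order: LLQ queues are drained
# exhaustively first, then CBWFQ queues are serviced round-robin.
from collections import deque

def schedule(llq_queues, cbwfq_queues):
    """Return packets in the order an LLQ+CBWFQ scheduler would emit them."""
    out = []
    for q in llq_queues:               # strict priority: empty each LLQ queue
        while q:
            out.append(q.popleft())
    while any(cbwfq_queues):           # then simple round robin over CBWFQ
        for q in cbwfq_queues:
            if q:
                out.append(q.popleft())
    return out

voice = deque(["v1", "v2"])            # LLQ: real-time traffic
gold = deque(["g1"])                   # CBWFQ: mission-critical class
best_effort = deque(["b1", "b2"])      # CBWFQ: best-effort class
print(schedule([voice], [gold, best_effort]))
# ['v1', 'v2', 'g1', 'b1', 'b2']
```

Note how both voice packets leave before any data packet, which is exactly why the 33% cap matters: an unbounded priority queue could starve the CBWFQ classes entirely.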
SECTION 6. MANAGEMENT TOOLS
Implementing QoS is not a one-time task to be implemented and forgotten. Effective QoS needs
to be monitored in both the short term and the long term, keeping historical data.
Figure 6-1 Quality of service lifecycle. [Diagram: a cycle of Classify, Monitor, and Adjust
Policy.]
Short-term monitoring ensures the QoS policy is having the desired effect. Long-term monitoring
and baselining are needed to ensure that the bandwidth configured for the various queues and
classes remains adequate, as users may be added, or an application upgrade (potentially changing
its traffic profile) could create a need for additional bandwidth to continue to support the QoS
policies.
SECTION 7. RECOMMENDATIONS
7.1 Implementation Recommendations
Enterprise networks should follow the hierarchical method of implementation wherever possible.
A hierarchical infrastructure provides the best performance in most circumstances. There are
three layers:
Access: Layer 2 or Layer 3 switching. The access layer is the point at which computer systems
connect to the network.

Distribution: Layer 3 switching. Used to route traffic to its destination and distribute traffic to
the access layer.
Core: Layer 2 or Layer 3 switching. The core is the high-speed interconnection for
communication among distribution switches.

Figure 7-1 Hierarchical Switching Model. [Diagram: servers and users attach at the access layer
(Layer 2 switching), which feeds the distribution layer (Layer 3 switching), which connects to
the core layer (Layer 2 or Layer 3 high-speed switching).]
Collapsed layers are layers that are combined into a single platform. For instance, a collapsed
core would be one that also performs the function of the distribution layer, or one that performs
the functions of both the distribution and access layers.
7.1.1 Access Layer
Access-Layer Server Farm switches should be used to set DSCP values for application classes.
7.1.1.1 Access Layer Switches
When considering the choices for access-layer devices, consider the switch's ability to classify
and mark traffic at the edge of the network via ACLs and service policies. This allows QoS to be
offered as a service throughout the network and administered at the edge of the network, where
CPU resources are plentiful, rather than at the distribution and core aggregation points, where
enforcement of QoS classification and marking could adversely affect network performance.
In access-layer switches, the number of queues is not as important as how those queues and their
various drop thresholds are configured and serviced. As few as two queues might be adequate for
wiring-closet access switches, where buffer management is less critical. How these queues are
serviced (RR, WRR, priority queuing, or a combination of priority queuing and WRR or WRED)
is less critical than the number of buffers, because the scheduler process is extremely fast
compared to the aggregate amount of traffic on, say, a distribution-layer switch.
7.1.1.2 Selecting Server Farm Access Switches
Figure 7-2 shows a typical server farm design. The server farm switch should be able to classify
traffic using ACLs and service policies to apply DSCP markings to traffic on ingress to the
network. This is used to assign class and ensure admission of traffic into the appropriate queue.
Recommended switches for use in the server farm:
Catalyst 6500 (with Policy Feature Card [PFC])
Catalyst 3550
Figure 7-2 Server Farm. [Diagram: servers, a Call Manager, and an H.323 gateway connect to a
server farm access layer of Layer 3-aware switches, which feeds a distribution layer of Catalyst
6500 Layer 3 switches and a core layer of Layer 2 or Layer 3 high-speed switches.]
Although many organizations like to use the same switches within a hierarchy, switches can be
used in any combination at the access layer that meets your needs. The differences between a
Catalyst 6500 and a Catalyst 3550 are primarily capacity and modularity.
The 6500 with PFC and the 3550 have the most intelligent and capable QoS feature sets. They
have the ability to mark at various layers and incorporate very efficient mechanisms.
7.1.1.2.1 Integrating the Catalyst 3550 at the access layer
The Catalyst 3550 is a very QoS-capable switch, ideal for the access layer and wiring closets.
The Catalyst 3550 can classify and mark traffic on ingress to the network using ACLs (access
control lists) and service policies. It is a very powerful access-layer device, able to identify
traffic flows at Layer 3 and Layer 4.
Two models of implementation
3550s can be connected individually to the distribution-layer switches, or they can be stacked.
Stacking is the connecting together of individual switches to form one large logical switch stack.
There are technical limits and recommendations on how many can be stacked. Consideration
should be given to the utilization levels on trunk links due to the aggregation of ports. Figure 7-3
illustrates the two models.
Figure 7-3 Two models of implementation for the Catalyst 3550. [Diagram: Catalyst 3550 access
switches connect to the distribution layer and core either individually or as a stack, with servers
and VLANs attached.]
7.1.1.2.2 Integrating the Catalyst 6500 at the access layer
One of the most popular campus configurations for Cisco solutions is the Catalyst 6500 switch in
both the wiring closet and the distribution and core layers. There are several compelling reasons
for this:
Supports dual supervisor engines, providing the highest availability of access solutions.

Can provide inline power to IP phones; current 10/100 boards support integrated inline power.

Offers the highest growth potential, with a scalable backplane and distributed CEF.

Supports advanced Layer 2/3 campus QoS tools.
7.1.2 Distribution Layer
Distribution-layer switches require more complex buffer management due to the flow
aggregation occurring at that layer. Cisco has chosen to use multiple thresholds within buffers
instead of continually increasing the number of buffers. This is because each time a queue is
configured, only frames meeting the queue criteria can use the memory associated with that
buffer. For example, assume that an Ethernet port has two queues configured: one for mission-
critical applications and the default queue, used for web, email, FTP, and Windows NT shares.
If the default queue begins to congest, then packets are dropped at the ingress interfaces,
regardless of whether the mission-critical application traffic is using any of its buffers. The
dropped packets of the TCP-oriented applications cause each of those applications to send the
data again, aggravating the congested condition. If this same scenario were configured with a
single queue but multiple thresholds, default traffic would share the entire buffer space with the
mission-critical application traffic. Only during periods of congestion, when the entire buffer
memory approaches saturation, would the lower-priority default-queue traffic (HTTP and email)
be dropped. It is important to remember that each port has a finite amount of buffer memory,
and a single queue has access to all the memory in the buffer; therefore, queuing should be used
cautiously. For this reason, WRR is often used in the distribution layer. This discussion does not
imply that multiple queues are to be avoided entirely: in voice networks a separate priority queue
is required. However, every single CoS or DSCP value should not get its own queue, because the
small size of the resulting default queue will cause many TCP resends and will actually increase
network congestion.
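The single-queue, multiple-threshold behavior described above can be sketched as a simple admission check: lower-priority traffic stops being admitted once buffer occupancy crosses its threshold, while higher-priority traffic may keep filling the shared buffer. The buffer size, class names, and thresholds are made-up numbers for illustration:

```python
# Illustrative model of a shared buffer with per-class drop thresholds.
# Values are invented for the example, not Cisco defaults.

BUFFER_SIZE = 100                       # total frames the shared buffer holds
THRESHOLDS = {
    "default": 80,                      # default traffic dropped past 80 frames
    "mission_critical": BUFFER_SIZE,    # mission-critical may use it all
}

def admit(occupancy, traffic_class):
    """Admit a frame only if occupancy is below the class's drop threshold."""
    return occupancy < THRESHOLDS[traffic_class]

print(admit(85, "default"))            # False: default traffic dropped early
print(admit(85, "mission_critical"))   # True: still admitted to shared buffer
```

The design point this captures is that both classes share the whole buffer under normal load, and only the lower-priority class is shed as the buffer approaches saturation.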
7.1.3 WAN Connections
Apply CBWFQ service policies that guarantee minimum bandwidth to applications that require
it, with WFQ for the other classes. Set drop preferences using WRED.
7.1.3.1 Internet Connections
The connection to the ISP should employ Network Based Application Recognition to classify
less-than-best-effort application traffic such as peer-to-peer media sharing applications.
7.2 Classification Recommendations
Classification recommendations are based on the IETF drafts for PHBs and DiffServ. TheDSCP decimal equivalent, IP precedence and CoS markings are also listed for backward
compatibility.
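For reference, the DSCP decimal values quoted in the recommendations that follow line up with the standard Assured Forwarding encoding, where AFxy corresponds to decimal 8x + 2y (per RFC 2597) and EF is decimal 46 (per RFC 3246). A small helper makes the correspondence explicit:

```python
# Decimal DSCP values for the Assured Forwarding PHB (RFC 2597):
# AF<class><drop precedence> encodes as 8*class + 2*drop_precedence.

EF = 46  # Expedited Forwarding codepoint (RFC 3246)

def af_dscp(af_class, drop_precedence):
    """Decimal DSCP for Assured Forwarding codepoint AF<class><drop>."""
    return 8 * af_class + 2 * drop_precedence

print(af_dscp(3, 1))                                 # 26 -> AF31, voice control
print(af_dscp(4, 1))                                 # 34 -> AF41, video conferencing
print(af_dscp(2, 1), af_dscp(2, 2), af_dscp(2, 3))   # 18 20 22 -> Gold subclasses
```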
7.2.1 Voice Traffic
Recommendation: DSCP EF (46), IP Precedence 5, CoS 5
These markings are used as selection criteria for entry into a priority queue, or into the queue
with the highest service weight and lowest drop probability in a WRR/WRED scheduling scheme.
7.2.2 Voice Control
Recommendation: DSCP AF31 (26), IP Precedence 3, CoS 3
Voice application systems will usually mark their control traffic with the appropriate DSCP and
CoS markings. However, some end devices may not have the capability to correctly classify their
own traffic.
7.2.3 Video Conferencing
Recommendation: DSCP AF41 (34), IP Precedence 4, CoS 4
Video conferencing over IP [IPVC] has loss, delay, and delay variation requirements similar to
those of VoIP traffic.
7.2.4 Streaming Video
Recommendation: DSCP AF13 (14), IP Precedence 1, CoS 1
Streaming video applications, like Video on Demand [VoD] programs, are high-bandwidth
applications and can tolerate high levels of loss, delay, and delay variation. Significant QoS tools
are not required to meet the needs of these applications. However, in some enterprise
environments these applications are considered more important than background applications
(such as e-mail and web browsing), and it may be desirable to give them preferential treatment.
7.2.5 Mission-Critical Data
Recommendation: Gold class (mission-critical): DSCP AF21-AF23 (18, 20, 22), IP Precedence 2, CoS 2
Silver class: DSCP AF11-AF13 (10, 12, 14), IP Precedence 1, CoS 1
As noted earlier, although Gold is a single class, using the DSCP decimal values 18-22 can
provide up to three subclasses of Gold applications.
7.2.6 Less-Than-Best-Effort Data
Recommendation: DSCP 2-6, IP Precedence 0, CoS 0
Non-critical, bandwidth-intensive data traffic. This traffic is delay-insensitive and should be
given the least preference of any of the classes; as such, it should be dropped sooner than any
other traffic. Less-than-best-effort traffic can be easily identified by the IP addresses of the
devices in the conversation or by well-known TCP or UDP port numbers. Classification of this
type of traffic is most effectively achieved at the edge of the network through ACLs matching
those addresses and ports.
Peer-to-peer file sharing applications, such as Napster, Kazaa, and Gnutella, also fall into the
category of less-than-best-effort traffic. These types of applications can have a considerable
impact on network utilization, and they are relatively difficult to identify by IP address and/or
TCP/UDP port numbers. NBAR should be used on WAN connections to identify and classify
these types of less-than-best-effort applications.
7.2.7 Best-Effort Data
Recommendation: DSCP BE (0), IP Precedence 0, CoS 0
All other traffic should be placed in the best-effort category. This includes all non-interactive
traffic, regardless of importance.