
    Juniper Networks, Inc.
    1194 North Mathilda Avenue
    Sunnyvale, CA 94089 USA
    408 745 2000 or 888 JUNIPER
    www.juniper.net

    Part Number: 200019-001 12/01

    Supporting Differentiated Service Classes

    in Large IP Networks

    Chuck Semeria

    Technical Marketing Engineer

    John W. Stewart III

    Product Line Manager

    White Paper


    Copyright 2001, Juniper Networks, Inc.

    Contents

    Executive Summary
    Perspective
    Fundamentals of Differentiated Services
        Classic Time-division Multiplexing vs. Statistical Multiplexing
            Classic Time-division Multiplexing (TDM)
            Statistical Multiplexing
        Best-effort Delivery
        Differentiated Service Classes
    The Impact of Statistical Multiplexing on Perceived Quality of Service
        Throughput
        Delay
            Sources of Network Delay
            Managing Delay While Maximizing Bandwidth Utilization
        Jitter
            Impact of Jitter on Perceived QoS
        Loss
            Packet Loss Can Be Good
    A Brief History of Differentiated Services in Large IP Networks
        The First Approach: RFC 791
        The Second Approach: The Integrated Services Model (IntServ)
            IntServ Architecture
            An IntServ Enhancement: Aggregation of RSVP Reservations
        The Third Approach: The Differentiated Services Model (DiffServ)
    The IETF Architecture for Differentiated Services
        Differentiated Services Domain (DS Domain)
        Differentiated Service Router Functions
            Packet Classification
            Traffic Conditioning
            Differentiated Services Router Functions
        Per-hop Behaviors (PHBs)
            Expedited Forwarding (EF PHB)
            Assured Forwarding (AF PHB)
            Default PHB
    General Observations about Differentiated Services
        DiffServ Does Not Create Free Bandwidth
        DiffServ Does Not Change the Speed of Light
        The Strictest Service Guarantees Will Be between Well-Known Endpoints
        Support for Interprovider DiffServ Is a Business Issue
        Providers Do Not Control All Aspects of the User Experience
    Conclusion
    Acronym Definitions
    References
        Requests for Comments (RFCs)
        Internet Drafts
        Textbooks
        Technical Papers


    List of Figures

    Figure 1: Classic Time-division Multiplexing (TDM)
    Figure 2: Statistical Multiplexing
    Figure 3: Classic TDM vs. Statistical Multiplexing
    Figure 4: End-to-end Delay Calculation
    Figure 5: Bandwidth Utilization vs. Round-trip Time (RTT) Delay
    Figure 6: Jitter Makes Packet Spacing Uneven
    Figure 7: Sources of Packet Loss in IP Networks
    Figure 8: Multiple Queues with Different Shares of a Port's Bandwidth
    Figure 9: Tail-drop Queue Management
    Figure 10: RFC 791 Bit Definitions of the ToS Byte
    Figure 11: Resource Reservation Protocol (RSVP)
    Figure 12: Cost Relative to Complexity of Differentiated Services Solutions
    Figure 13: Differentiated Services Field (DS Field)
    Figure 14: Differentiated Services Domain (DS Domain)
    Figure 15: Packet Classifier and Traffic Conditioner
    Figure 16: DiffServ Router Functions

    List of Tables

    Table 1: Serialization Delay (Packet Size vs. Port Speed)
    Table 2: Recommended AF DiffServ Codepoint (DSCP) Values


    Executive Summary

    This white paper is the introduction to a series of papers published by Juniper Networks, Inc. that describe the support of differentiated service classes in large IP networks. This overview presents the motivations for deploying multiple service classes, the fundamentals of statistical multiplexing, and the impact of statistical multiplexing on the quality of service delivered by a network in terms of packet throughput, delay, jitter, and loss. We also provide a brief history of the various approaches that have been proposed to support differentiated service classes, a description of the IETF DiffServ architecture, and general observations about what you can expect from the deployment of multiple service classes in your network. The other papers in this series provide technical discussions of queue scheduling disciplines, queue memory management, host TCP congestion-avoidance mechanisms, and other issues related to the deployment of multiple service classes in your network.

    Perspective

    Service provider IP networks have traditionally supported only public Internet service. Initially, Internet applications (e-mail, remote login, file transfer, and Web access) were not considered mission-critical and did not have specific performance requirements for throughput, delay, jitter, and packet loss. As a result, a single best-effort class of service (CoS) was adequate to support all Internet applications.

    However, the commercial success of the Internet has caused all of this to change, thus affecting service providers in several ways.

    - Your IP network is now the single largest consumer of bandwidth, or at least is trending in that direction.

    - Your network's 24/7 availability and reliability are even more imperative, because Internet services have become mission-critical. For some organizations, such as online retailers or stock markets, an hour-long network outage can be extremely expensive.

    - You need to differentiate your company from the competition by offering a range of service classes with service-level agreements (SLAs) that are specifically tailored to meet your customers' and their customers' requirements.

    - You want to offer better classes of service to your premium customers and charge more for those services.

    - You are probably considering offering services such as voice-over-IP (VoIP) or virtual private networks (VPNs) that have more rigid performance requirements than traditional Internet applications.

    You may also be considering deploying a variety of services, each with different performance requirements, over a shared IP infrastructure. In a multiservice IP network, IP routers, rather than Frame Relay switches, ATM switches, or voice switches, are used to access the transmission network.

    - A larger service portfolio allows you to attract and keep new customers.

    - Converged networks minimize your operating expenses, because you have fewer networks to manage.

    - A packet-based network maximizes bandwidth efficiency through the use of statistical multiplexing.


    There are two fundamentally different approaches to supporting the delivery of multiple service classes in large IP networks. One approach is simply to overprovision the network and throw raw bandwidth at the problem. The other approach is to build a CoS-enabled backbone based on bandwidth management.

    Those who favor overprovisioning argue that:

    - The additional cost and complexity of managing traffic outweighs the gain it provides in bandwidth efficiency.

    - It is very difficult to monitor, verify, and account for multiple service classes in large IP networks.

    - You already have other CoS-enabled infrastructures (TDM and ATM) that you can use to support services that have strict performance requirements.

    Those who favor bandwidth management argue that:

    - Bandwidth management allows you to optimize bandwidth utilization and run your network at close to its maximum capacity.

    - New applications emerge, you deploy new networking equipment, and bandwidth arrives in discrete chunks. These events rarely occur in a coordinated manner, and traffic management allows you to control bandwidth and smoothly handle mismatches in network capacity as these transitions occur.

    - Bandwidth management allows you to increase your revenue by selling multiple service classes over a shared infrastructure, such as a converged IP/MPLS backbone. A converged infrastructure allows you to reduce your operating expenses, to use a single access technology, and to market a wide range of integrated products, such as Internet access, VPN access, and videoconferencing.

    While the arguments for both of these approaches are convincing, their costs are roughly equal. Initially, deploying bandwidth management in your network involves simply enabling specific router functions. However, there are a number of hidden training, operational, and maintenance costs involved in successfully managing bandwidth in a production network. Also, while it is relatively easy to understand how to manage bandwidth from an engineering perspective, service providers have very little practical experience in supporting, debugging, tuning, and accounting for multiple service classes in large IP networks. On the other hand, if you do not have the ability to throttle traffic to some degree, even a network of enormous bandwidth can be overrun by misbehaving applications, to the point that mission-critical and delay-sensitive services are severely impacted.

    Successful providers will adopt a solution that is based on a combination of overprovisioned bandwidth and MPLS traffic engineering to minimize the long-term average level of congestion, while also deploying Integrated Services (IntServ) and Differentiated Services (DiffServ) to address the requirements of delay- and jitter-sensitive traffic during short-term periods of congestion. It is only through a combination of technologies that you will be able to support the delivery of differentiated service classes on a large scale and at a reasonable cost.


    Fundamentals of Differentiated Services

    To support business objectives that require multiple service classes, there is growing interest in the mechanisms that make it possible to deliver differentiated traffic classes over a common IP infrastructure. Because these mechanisms are widely misunderstood, we begin with a discussion of some of the fundamental concepts that are relevant to the deployment of differentiated service classes.

    Classic Time-division Multiplexing vs. Statistical Multiplexing

    As you know, network transmission facilities are an expensive resource. Multiplexing can save you money by allowing many different data flows to share a common physical transmission path, rather than requiring that each flow have a dedicated transmission path. There are currently two basic types of multiplexing used in data communications:

    - Time-division multiplexing (TDM): The transmission facility is divided into multiple channels by allocating the facility to several different channels, one at a time.

    - Frequency-division multiplexing (FDM): The transmission facility is divided into multiple channels by using different frequencies to carry different signals.

    Within TDM, there are two methods of arbitrating bandwidth on an output port: the static allocation of fixed-sized time slots and the dynamic allocation of variable-sized time slots. Classic TDM devices switch traffic by using static arbitration to allocate input bandwidth to an equal amount of output bandwidth and by mapping traffic to a specific output time slot. Packet switches use variable arbitration, with bandwidth allocated on demand on a per-packet basis.

    Classic Time-division Multiplexing (TDM)

    Classic time-division multiplexing (TDM) is a technique that is applied to circuit-switched networks. TDM assumes that data streams are organized into bits, bytes, or words rather than packets. Figure 1 illustrates the basic concept behind TDM.

    Figure 1: Classic Time-division Multiplexing (TDM)

    Although the following description is not the classic definition of TDM, it is sufficient to provide a background for our discussion of differentiated service classes. At the ingress end of the shared link, the TDM multiplexer samples and then interleaves the five discrete input data streams in a round-robin fashion, granting each stream the entire bandwidth of the shared link for a very short time. TDM guarantees that the bandwidth of the output link is never less than the sum of the rates of the individual input streams, because, at input, each unit of bandwidth is mapped at configuration time to an equal-sized unit of bandwidth on the output link. At the egress end of the shared link, the TDM demultiplexer processes the traffic and reconstructs the five individual data streams.

    There are two key features of classic TDM that are relevant to our discussion of supporting multiple service classes:

    - First, it is not necessary to buffer data when input streams are multiplexed onto the shared output link, because the capacity of the output link is always greater than or equal to the sum of the rates of the individual input streams.

    - Second, classic TDM leads to an aggregate underutilization of bandwidth on an output port. Assuming that you are transmitting packet data over a classic TDM system, each input channel consumes somewhere between zero percent and 100 percent of its available bandwidth, depending on the burstiness of the application. If you add up the bandwidth that actually goes unused across all of the channels in your system, the overall bandwidth utilization on the output port can be as low as 10 to 15 percent, depending on the specific behavior of your traffic.

    Two common examples of classic TDM in large carrier or provider networks are:

    - A T-1 multiplexer with 28 T-1 circuits on the input side and one DS-3 circuit on the output side, or

    - A SONET multiplexer with four OC-12c/STM-4s on the input side and one OC-48c/STM-16 on the output side.

    Statistical Multiplexing

    Statistical multiplexing is designed to support packet-switched networks by dynamically allocating variable-length time slots on an output port. Statistical multiplexing devices assume that data flows are organized into packets, frames, or cells rather than bits, bytes, or words. Figure 2 illustrates the basic concept behind statistical multiplexing.

    Figure 2: Statistical Multiplexing

    Unlike classic TDM devices, a statistical multiplexing device does not map each unit of input bandwidth to an equal-sized unit of bandwidth on an output port. Statistical multiplexing dynamically allocates bandwidth on an output port only to active input streams, making better use of the available bandwidth and allowing more streams to be transported across the shared port than with other multiplexing techniques.


    A packet, frame, or cell arriving on one port of a statistical multiplexing device can potentially exit from any other port of the device. The specific output port is determined by the result of a lookup, based on the contents of the packet header: a MAC address, a VPI/VCI, a DLCI, or an IP address. This means that there may be times when more packets, frames, or cells need to be transmitted from a port than the given port has bandwidth to support. When this occurs, the statistical multiplexing device places the oversubscribed packets, frames, or cells into a buffer (queue) that is associated with the output port. The buffer absorbs packets during the extremely short periods of time when the output port experiences congestion.

    Common examples of statistical multiplexing devices in large carrier or provider networks include:

    - IP routers,

    - Ethernet switches, and

    - Frame Relay switches.

    Optimal Buffer Size

    Determining the optimal size for a packet buffer is critical, because providing a packet buffer that is too small is just as bad as providing one that is too large.

    - Small packet buffers can cause packets from bursts to be dropped. This forces a host TCP to reduce its transmission rate by returning to slow-start or congestion-avoidance mode, which can severely reduce the session's overall packet throughput rate.

    - Large packet buffers at each hop can cause the total round-trip time (RTT) to increase to a point where packets that are waiting in buffers in the core of a network are retransmitted by the source TCP even though they have not been dropped. A source TCP maintains a retransmission timer that it uses to decide when it should start retransmitting lost packets if it does not receive an ACK from the destination TCP.

    Optimally, a router buffer needs to be large enough to absorb the burstiness of traffic flows but small enough that the RTT remains relatively small, so that packets waiting in queues are not mistakenly retransmitted.

    The amount of memory that needs to be assigned to each queue is determined by the speed of the link, the behavior of the traffic, and the characteristics of the higher-layer transport protocol that provides flow control. For a queue designed to support UDP-based, real-time applications, such as VoIP, a large packet buffer is not desirable, because it can increase end-to-end delay. However, for a queue designed to support TCP-based applications, optimal performance requires that the bandwidth-delay buffer size be calculated using the following formula:

    Buffer_Size = (port bandwidth) * (longest RTT among flows forwarded across the port)

    For example, the size of the buffer required to support a maximum round-trip delay of 100 ms on an OC-48c/STM-16 port is ~32 MB.
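    As a minimal sketch of this bandwidth-delay calculation (the OC-48c line-rate constant and the helper name below are our own, not from any router implementation):

        def bandwidth_delay_buffer_bytes(port_bps: float, longest_rtt_s: float) -> float:
            """Bandwidth-delay product: queue memory needed to keep TCP flows busy."""
            return port_bps * longest_rtt_s / 8  # divide by 8 to convert bits to bytes

        OC48_BPS = 2.488e9  # OC-48c/STM-16 line rate in bits per second
        buf = bandwidth_delay_buffer_bytes(OC48_BPS, 0.100)  # 100 ms worst-case RTT
        print(f"{buf / 1e6:.0f} MB")  # ~31 MB, in line with the ~32 MB figure above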

    Bandwidth Oversubscription

    Voice networks have always been oversubscribed, in that dedicated bandwidth is not reserved for each potential voice user. Carriers can oversubscribe their voice networks because there are far more voice subscribers than there are voice calls at any given moment. Generally, it is easier to provision a voice network than a data network, because you have a much better understanding of the call activity you expect to see at any time of day than you do of the amount of data traffic your network will be required to transport at the same time. However, we have all experienced situations when all circuits are busy during catastrophic events.


    In packet-based networks, statistical multiplexing takes advantage of the fact that each host attached to a network is not always active, and when it is active, data is transmitted in bursts. As a result, statistical multiplexing allows you to oversubscribe network resources and support a greater number of flows than classic TDM using the same amount of bandwidth. This is known as the statistical multiplexing gain.

    Figure 3 shows three hosts transmitting data. When the network uses classic TDM to access the output port, certain time slots remain empty, which causes the bandwidth of those time slots to be wasted. In contrast, when the network uses statistical multiplexing to access the output port, empty time slots are not transmitted, so this extra bandwidth can be used to support the transmission of other statistically multiplexed flows.

    Figure 3: Classic TDM vs. Statistical Multiplexing

    Let's examine typical oversubscription numbers used by large service providers. Core links are typically oversubscribed by a factor of 2X, while access links are generally oversubscribed by a factor of 8X (eight times more potential capacity going into the core than the core can transport). As long as the queues in the network usually remain empty, the network will continue to provide satisfactory performance at these oversubscription levels. If the traffic patterns in the network are well understood, then it is possible to apply an oversubscription policy that ensures that queues do, in fact, usually remain empty. The oversubscription capabilities supported by statistical multiplexing devices offer monetary savings. For example, an oversubscription policy of 20 percent allows packets from almost 23 E-3 circuits (775 Mbps) to be aggregated onto a single OC-3/STM-1 circuit (155 Mbps).
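    A minimal sketch of this aggregation arithmetic, assuming the 20 percent policy means the output circuit provides one-fifth of the aggregate input capacity (the rate constants are nominal line rates):

        E3_MBPS = 34.368      # nominal E-3 rate
        OC3_MBPS = 155.52     # nominal OC-3/STM-1 rate

        policy = 0.20         # provisioned output = 20% of aggregate input
        aggregate_input_mbps = OC3_MBPS / policy    # ~778 Mbps of offered capacity
        num_e3 = aggregate_input_mbps / E3_MBPS     # ~22.6, i.e., "almost 23" E-3s
        print(f"{aggregate_input_mbps:.0f} Mbps -> {num_e3:.1f} E-3 circuits")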

    Statistical Multiplexing and Multiple Service Classes

    As a foundation to our discussion of differentiated service classes, there are two key features to keep in mind regarding statistical multiplexing:

    - Statistical multiplexing requires packet buffering during transient periods of congestion, when the output-port bandwidth is momentarily less than the sum of the rates of the input flows seeking to use that bandwidth.


    - Statistical multiplexing provides significantly better utilization of the output-port bandwidth than classic TDM. This enhanced utilization can be approximately four times (4X) greater than classic TDM, depending on the specific traffic flows. The higher utilization of output-port bandwidth is the key benefit of statistical multiplexing when compared with classic TDM.

    Best-effort Delivery

    IP routers perform statistical multiplexing because they are packet switches. The Internet Protocol is a datagram protocol, where each packet is routed independently of all other packets, without the concept of a connection. IP has traditionally offered only a single class of service, known as best-effort delivery, where all packets traversing the network are treated with the same priority. Best-effort means that IP makes a reasonable effort to deliver each datagram to its destination with uncorrupted data, but there are no guarantees that a packet will not be corrupted, duplicated, reordered, or misdelivered. Additionally, there are no promises with respect to the amount of throughput, delay, jitter, or loss that a traffic stream will experience. The network makes a best-effort attempt to satisfy its clients and does not arbitrarily discard packets. However, best-effort service without the support of intelligent transport protocols would lead to chaos. The only reason that best-effort works in global IP networks is that TCP does not compromise the network when it experiences congestion, but rather detects and then responds smoothly to packet loss by reducing its transmission rate. TCP is the basic building block that makes the best-effort queue the most well-behaved queue in a router, because it backs off when it experiences congestion.

    Best-effort delivery is not a pejorative term. In fact, the ability to support a single best-effort service has allowed large IP networks and the Internet to become what they are today: the unchallenged technology of choice for supporting mission-critical applications at a global scale. However, there are a number of perceived issues related to IP's ability to support only a single best-effort class of service and the potential impact on IP's continued commercial success. Some carriers and providers see the need to offer multiple service levels if they are to support the deployment of new services, each with different performance requirements, over a shared IP infrastructure.

    Differentiated Service Classes

    Supporting multiple service classes for specific applications or customers is a matter of treating packets that belong to certain data streams differently from packets that belong to other data streams. Multiple service classes are all about providing managed unfairness to certain traffic classes.

    Differentiated service levels are supported by manipulating the key attributes of certain streams to change the customer's perception of the quality of service that the network is delivering. These attributes include:

    - The amount of data that can be transmitted per unit of time (throughput),

    - The amount of time that it takes for data to be transmitted from one point to another point in the network (delay or latency),

    - The variation in this delay over time (jitter) for consecutive packets in a given flow, and

    - The percentage of transmitted data that does not arrive at its destination correctly (loss).

    However, the quality of service provided to a given service class can be only as good as the lowest quality of service delivered by the weakest link in the end-to-end path.


    The concept of multiple service classes is not applicable to classic TDM services, because if a TDM link is up, then bandwidth, delay, and jitter are constant, and packet loss is zero. Any errors that do occur result from the link going down: bandwidth goes to zero, delay goes to infinity, and loss goes to 100 percent. For classic TDM services, the concept of differentiated service classes instead involves providing different uptime commitments and meeting different customer service requirements, restoration times, and so forth.

    The need to provide multiple service classes for customers or applications applies much more to the delivery of statistically multiplexed services. This is because specific packet flows of interest traverse several routers, and the quality of service perceived by individual users is a function of the way that statistical multiplexing is performed at each hop in the path, as well as the characteristics of the individual links in the path. By treating some packets differently from others when performing statistical multiplexing, a network of routers can offer different kinds of throughput, delay, jitter, and loss for different packet flows.

    Finally, supporting differentiated service classes through bandwidth reservations or lower oversubscription factors for higher-priority services results in a less efficient use of network bandwidth than providing only a single best-effort statistical multiplexing service. However, you can compensate for this lower bandwidth efficiency by charging your subscribers a premium for higher-priority services.

    NOTE

    Once you make the business decision to offer multiple levels of service, it is important to perform the analysis necessary to determine exactly how much more you need to charge your subscribers to maintain your profit margins and to compensate for your loss of bandwidth efficiency.

    The Impact of Statistical Multiplexing on Perceived Quality of Service

    In this section, we examine how the statistical multiplexing performed by routers can influence the user's perception of the quality of service delivered by a network. The quality of service attributes that can be affected by statistical multiplexing include:

    - Throughput,

    - Delay,

    - Jitter, and

    - Loss.

    Throughput

    Throughput is a generic term used to describe the capacity of a system to transfer data. It is easy to measure the throughput for a TDM service, because the throughput is simply the bandwidth of the transmission channel. For example, the throughput of a DS-3 circuit is 45 Mbps. However, for TCP/IP statistically multiplexed services, throughput is much harder to define and measure, because there are numerous ways that it can be calculated, including:

    - The packet or byte rate across the circuit,

    - The packet or byte rate of a specific application flow,

    - The packet or byte rate of host-to-host aggregated flows, or

    - The packet or byte rate of network-to-network aggregated flows.


    The most direct way that a router's statistical multiplexing can be tuned to affect throughput is by the amount of bandwidth it allocates to different types of packets.

    - In classic best-effort service, the router does not specifically control the amount of bandwidth assigned to different traffic classes. Instead, during periods of congestion, all packets are placed into a single first-in, first-out (FIFO) queue. When faced with congestion, User Datagram Protocol (UDP) flows continue to transmit at the same rate, but TCP flows detect and then react to packet loss by reducing their transmission rate. As a result, UDP flows end up consuming the majority of the bandwidth on the congested port, while each TCP flow receives a roughly equal share of the leftover bandwidth.

    - When attempting to support differentiated treatment for different traffic classes, each class of traffic can be given a different share of the output-port bandwidth. For example, a router can be configured to allocate different amounts of bandwidth to each class of traffic on the output port; one class of traffic can be given strict priority over all other classes; or one class of traffic can be given strict priority with a bandwidth limit (to prevent the starvation of the other classes). The support of differentiated service classes implies the use of more than just a single FIFO queue on each output port.

    Delay

    Delay (or latency) is the amount of time that it takes for a packet to be transmitted from one point in a network to another point in the network. A number of factors contribute to the amount of delay experienced by a packet as it traverses your network:

    - Forwarding delay,

    - Queuing delay,

    - Propagation delay, and

    - Serialization delay.

    Figure 4 illustrates that the end-to-end delay can be calculated as the sum of the individual forwarding, queuing, serialization, and propagation delays occurring at each node and link in your network.

    Figure 4: End-to-end Delay Calculation
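    A back-of-the-envelope version of this calculation might look like the following sketch; the per-hop numbers are illustrative placeholders, not measurements:

        # End-to-end delay: sum the per-node and per-link components (all in ms).
        hops = [
            # (forwarding, queuing, serialization, propagation) per node + link
            (0.05, 0.20, 0.0048, 8.0),
            (0.05, 0.50, 0.0048, 12.0),
            (0.05, 0.10, 0.0048, 10.0),
        ]
        end_to_end_ms = sum(fwd + q + ser + prop for fwd, q, ser, prop in hops)
        print(f"End-to-end one-way delay: {end_to_end_ms:.2f} ms")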

    However, when examining the causes of application delay in your network, it is important to remember that the routers represent only a part of the end-to-end path, and you must also consider several other factors:


    - The performance bottlenecks within hosts and servers,

    - Operating system scheduling delays,

    - Application resource contention delays,

    - Physical layer framing delays,

    - CODEC encoding, compression, and packetization delays,

    - The quality of the different TCP/IP implementations running on these end systems, and

    - The stability of routing in the network.

    Sources of Network Delay

    In this section, we examine each of the sources of delay: forwarding, queuing, propagation, and serialization delay.

    Forwarding Delay

    Forwarding delay is the amount of time that it takes a router to receive a packet, make a forwarding decision, and then begin transmitting the packet through an uncongested output port. This represents the minimum amount of time that it takes the router to perform its basic function and is typically measured in tens or hundreds of microseconds (0.000001 sec). Other than deploying industry-standard, hardware-based routers, you have no real control over forwarding delay.

    Queuing Delay

    Queuing delay is the amount of time that a packet has to wait in a queue, as the system performs statistical multiplexing and other packets are serviced, before it can be transmitted on the output port. The queuing delay at a given router can vary over time from zero (for an uncongested link) to the sum of the times that it takes to transmit each of the other packets queued ahead of it. During periods of congestion, the queue memory management and queue scheduling disciplines allow you to control the amount of queuing delay experienced by different classes of traffic placed in different queues.

    Propagation Delay

    Propagation delay is the amount of time that it takes for electrons or photons to traverse a physical link. The propagation delay is based on the speed of light and is measured in milliseconds (0.001 sec). When estimating the propagation delay across a point-to-point link, you can assume one millisecond (1 ms) of propagation delay per 100 miles (160 km) of round-trip distance. Consequently, the speed-of-light propagation RTT delay from San Francisco to New York (6000 mi, 9654 km) is between 60 and 70 ms (0.060 sec to 0.070 sec). Because you can't change the speed of light in optical fiber, you have no control over propagation delay.
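    The rule of thumb above reduces to a one-line estimate (the function name is our own):

        def propagation_rtt_ms(round_trip_miles: float) -> float:
            """~1 ms of speed-of-light delay per 100 miles of round-trip distance."""
            return round_trip_miles / 100.0

        # San Francisco to New York: ~6000 round-trip miles -> ~60 ms RTT floor
        print(propagation_rtt_ms(6000))  # 60.0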

    It is interesting to note that the speed of light in optical fiber is approximately 65 percent of the speed of light in a vacuum, while the speed of electron propagation through copper is slightly faster, at 75 percent of the speed of light. Although the signal representing each bit travels slightly faster in copper than in fiber, fiber has numerous advantages over copper: it results in fewer bit errors, supports longer cable runs between repeaters, and allows more bits to be packed into a given length of cable. For example, a 10 Mbps copper interface (traditional Ethernet) transports 78 bits per mile (124 bits per km), resulting in a 1500-byte packet that is 154 miles (248 km) long. In contrast, a 2.488 Gbps fiber interface (OC-48c/STM-16) transports 19,440 bits per mile (31,104 bits per km), creating a 1500-byte packet that is only 3,260 feet (994 m) long.

    Serialization Delay

    Serialization delay is the amount of time that it takes to place the bits of a packet onto the wire when a router transmits a packet. Serialization delay is measured in milliseconds (ms, or 0.001 sec) and is a function of the size of the packet and the speed of the port. Since there is no practical mechanism to control the size of the packets in your network (other than reducing the MTU or forcing packet fragmentation), the only action you can take to reduce serialization delay is to install higher-speed router interfaces.

    Table 1 displays the serialization delay for various packet sizes and different port speeds.

    Table 1: Serialization Delay (Packet Size vs. Port Speed)

    Packet Size   DS-1         DS-3        OC-3        OC-12       OC-48       OC-192
    40 byte       0.2073 ms    0.0072 ms   0.0021 ms   0.0005 ms   0.0001 ms   < 0.0001 ms
    256 byte      1.3264 ms    0.0458 ms   0.0132 ms   0.0033 ms   0.0008 ms   0.0002 ms
    320 byte      1.6580 ms    0.0572 ms   0.0165 ms   0.0041 ms   0.0010 ms   0.0003 ms
    512 byte      2.6528 ms    0.0916 ms   0.0264 ms   0.0066 ms   0.0016 ms   0.0004 ms
    1500 byte     7.7720 ms    0.2682 ms   0.0774 ms   0.0193 ms   0.0048 ms   0.0012 ms
    4470 byte     23.1606 ms   0.7994 ms   0.2307 ms   0.0575 ms   0.0144 ms   0.0036 ms
    9180 byte     47.5648 ms   1.6416 ms   0.4738 ms   0.1181 ms   0.0295 ms   0.0074 ms

    From Table 1, you can see that it takes 7.7 ms to place a 1500-byte packet on a DS-1 circuit. This is a significant amount of time if you consider that the typical one-way propagation delay from San Francisco to New York (3000 mi, 4827 km) is between 30 and 35 ms. On the other hand, the serialization delay for a 1500-byte packet on an OC-192c/STM-64 port is only 0.0012 ms. In a network consisting of high-speed interfaces, serialization delay contributes an insignificant amount to the overall end-to-end delay. However, in a network consisting of low-speed interfaces, serialization delay can contribute significantly to the overall end-to-end delay.
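    The values in Table 1 follow directly from packet size divided by port speed; a minimal sketch (the DS-1 and OC-192c rate constants are nominal):

        def serialization_delay_ms(packet_bytes: int, port_bps: float) -> float:
            """Time to clock a packet onto the wire: bits divided by port speed."""
            return packet_bytes * 8 / port_bps * 1000

        print(f"{serialization_delay_ms(1500, 1.544e6):.4f} ms")  # DS-1: 7.7720 ms
        print(f"{serialization_delay_ms(1500, 9.953e9):.4f} ms")  # OC-192c: 0.0012 ms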

    Managing Delay While Maximizing Bandwidth Utilization

    Given that the only component of end-to-end delay that you can actually control is queuing delay, support for differentiated service classes is based on managing the queuing delay experienced by different traffic classes during periods of network congestion. In the absence of active queue management techniques, such as Random Early Detection (RED), there is a direct relationship between the bandwidth utilization on a link and the RTT delay. If you maintain a 5-minute weighted bandwidth utilization of 10 percent, there will be minimal packet loss and minimal RTT delay, because the output ports are generally underused. However, if you increase the 5-minute weighted bandwidth utilization to approximately 50 percent, the average RTT starts to increase exponentially as the load on your network increases. (See Figure 5.)



    Figure 5: Bandwidth Utilization vs. Round-trip Time (RTT) Delay

    The challenge when trying to manage delay is that, at the same time, you also need to maximize bandwidth utilization in your network for financial reasons. Bear in mind that bandwidth utilization statistics are meaningful only when the length of the circuit observation period is specified. If you measure bandwidth utilization over one nanosecond (0.000000001 sec), you get one of two values: zero percent or 100 percent utilization. If you measure the utilization of a circuit over 5 minutes, you get a reasonably damped average. Whenever we discuss bandwidth utilization here, we always mean a 5-minute weighted average.

    A 5-minute weighted bandwidth utilization of 50 percent doesn't mean just 50 percent utilization. It means that there are short, sub-second intervals when utilization is close to 100 percent, queues fill up, and packets are dropped. It also means that there are other periods when the bandwidth utilization is close to zero percent, queue depth is zero, and packets are never dropped. A 5-minute weighted average utilization of 50 to 60 percent is considered heavy bandwidth utilization. If financial factors compel you to drive your utilization up to 70 or 75 percent, then you dramatically increase the RTT delay and the variation in RTT delay for all applications running across your network.

    So your dilemma is how to optimize the bandwidth utilization of your network while also managing queuing delays for delay-sensitive traffic. To find the solution, you must first determine which applications in your network can cope with increasing delay and delay variation. TCP-based applications are specifically designed to be rate-adaptive and to cope with delay, but there are other types of applications, such as real-time voice, that are unable to operate smoothly when experiencing long delays or delay variation.

    Therefore, the solution to optimizing bandwidth utilization while also managing queuing delays is to isolate the applications that cannot handle delay from the 50 to 60 percent utilization class. You can accomplish this by placing packets from those applications into a dedicated queue that does not experience the aggregate delay caused by the high utilization of the circuit. In effect, you identify a certain set of applications, isolate those applications from other types of traffic by placing them into a dedicated queue, and then control the amount of queuing delay experienced by those specific applications.

    There are three things that you need to keep in mind with respect to delay in your network:

    - In a well-designed and properly functioning network, queuing delay should be close to zero when averaged over time. There will always be extremely short periods of congestion, but network links need to be properly provisioned. Otherwise, queuing delay will increase rapidly, because you are asking too much traffic to cross an underprovisioned link.

    - If you examine the relative impact of the factors other than queuing delay that contribute to delay (forwarding, propagation, and serialization), propagation delay is the major source of delay, by several orders of magnitude.

    - The only delay factor that you can control is queuing delay. The challenge with the other factors is that you have no real control over them.

    Jitter

    Jitter is the variation in delay over time experienced by consecutive packets that are part of the same flow. (See Figure 6.) You can measure jitter by using a number of different techniques, including the mean, standard deviation, maximum, or minimum of the interpacket arrival times for consecutive packets in a given flow.

    Figure 6: Jitter Makes Packet Spacing Uneven
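    As a sketch of the measurement techniques just mentioned, jitter can be summarized from a list of packet arrival timestamps (the timestamps here are invented for illustration):

        from statistics import mean, stdev

        def jitter_stats(arrival_times_ms):
            """Summarize jitter as statistics over interpacket arrival gaps."""
            gaps = [b - a for a, b in zip(arrival_times_ms, arrival_times_ms[1:])]
            return {"mean": mean(gaps), "stdev": stdev(gaps),
                    "min": min(gaps), "max": max(gaps)}

        # Packets sent with a constant 20 ms spacing arrive unevenly spaced:
        print(jitter_stats([0.0, 21.3, 39.8, 62.5, 80.1]))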

    TDM systems can cause jitter, but the variation in delay is so small that, for all practical purposes, you can ignore it. In a statistically multiplexed network, the primary source of jitter is the variability of queuing delay over time for consecutive packets in a given flow. Another potential source of jitter is that consecutive packets in a flow may not follow the same physical path across the network, due to equal-cost load balancing or routing changes.

    Jitter increases exponentially with bandwidth utilization, just like delay. You can see this by executing a number of pings across a highly used link. You will notice not only an increase in delay, but also an increase in the variation of delay.

    There are a couple of other considerations relevant to jitter in statistical multiplexing networks:

    - In statistically multiplexed networks, the end-to-end jitter is never constant, because the level of congestion in a network always changes from place to place and moment to moment. Unless you are assured that the transmission of a packet will begin immediately after a router's forwarding decision, the amount of delay introduced at each hop in an end-to-end path is variable.


    - ATM has traditionally supported real-time traffic by using 53-byte cells as a way to place an upper bound on the amount of delay that a cell is subject to at any single network node. The point is that a 53-byte time period is a lot less than a 1500-byte time period.

    Impact of Jitter on Perceived QoS

    Some applications are unable to handle jitter.

    - With interactive voice or video applications, jitter can result in a jerky or uneven quality to the sound or image. The solution is to properly provision the network, including the queue scheduling discipline, and to condition traffic so that jitter stays within acceptable limits. The jitter that remains can be handled by a short playback buffer on the destination host that buffers packets briefly before playing them back as a smoothed data stream.

    - For emulated TDM service over a statistically multiplexed network, jitter outside of a narrowly defined range can introduce errors. The solution is to properly provision the network, including priority queuing, and to condition traffic at the edges of the network so that jitter stays within a predefined range.

    However, there are other types of applications (such as those that run over TCP/IP) for which jitter is not a problem. Also, for non-interactive applications, such as streaming voice or video, jitter does not present serious problems, because it can be overcome by using large playback buffers.

    Loss

    There are three sources of packet loss in an IP network, as illustrated in Figure 7:

    - A break in a physical link that prevents the transmission of a packet,

    - A packet that is corrupted by noise, which is detected by a checksum failure at the downstream node, and

    - Network congestion that leads to buffer overflow.

    Figure 7: Sources of Packet Loss in IP Networks

    Breaks in physical links do occur, but they are rare, and the combination of self-healing physical layers and redundant topologies responds dynamically to this source of packet loss. With the exception of wireless networking, when using modern physical-layer technologies the chance of packet corruption is statistically insignificant, so you can ignore this source of packet loss as well.

    Consequently, the primary cause of packet loss in a non-wireless IP network is buffer overflow resulting from congestion. The amount of packet loss in a network is typically expressed in terms of the probability that a given packet will be discarded by the network.


    IP networks do not carry a constant load, because traffic is bursty, and this causes the load on the network to vary over time. There are periods when the volume of traffic that the network is asked to carry exceeds the capacity of some of the components in the network. When this occurs, congested network nodes attempt to reduce their load by discarding packets. When the TCP/IP stack on a host system detects a packet loss, it assumes that the packet loss is due to congestion somewhere in the network.

    Packet Loss Can Be Good

    It is important to understand that packet loss in an IP network is not always a bad thing. Each TCP session seeks all of the bandwidth that it can for its flow, but it must find the maximum bandwidth without causing sustained congestion in the network. TCP accomplishes this by transmitting slowly at the beginning of each session (slow-start), and then increasing the transmission rate until it eventually detects the loss of a packet. Since TCP understands that a packet drop means congestion is present at some point in the network, it reacts to the congestion by temporarily reducing the transmission rate of the flow. Given enough time, each TCP flow will eventually settle on the maximum bandwidth it can get across the network without experiencing sustained congestion. When multiple TCP flows do this in parallel, the result is fairness for all TCP sessions across the network. Thus, occasional packet loss is good, because each TCP session needs to experience some amount of packet loss to find all of the bandwidth that it can get for its flow.
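    The probing behavior described above can be caricatured in a few lines. This is a toy additive-increase/multiplicative-decrease loop with an invented link capacity, not a faithful TCP implementation:

        # Toy model: a TCP sender probes for bandwidth until loss, then backs off.
        link_capacity = 100.0   # arbitrary units
        rate = 1.0              # each session starts slowly (slow-start)

        for _ in range(40):
            if rate > link_capacity:  # queue overflows and a packet is dropped
                rate /= 2             # multiplicative decrease on detecting loss
            else:
                rate += 5             # keep probing upward for more bandwidth
        print(f"Settles into a sawtooth below capacity: rate = {rate:.0f}")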

    Host response to network congestion is the same whether your IP network runs over a packet infrastructure or over an ATM infrastructure, because TCP congestion-avoidance mechanisms are executed at the transport layer, not the data link layer. An ATM transport does not possess remarkable properties that allow you to better control the amount of traffic that a host injects into your network, because the applications are native-IP-based, not ATM-based. If you need to control end-system behavior, you are still required to perform traffic policing or shaping at the ingress edges of your network. ATM can support this only at a relatively coarse level, because it is not aware of TCP/IP or the operation of its congestion-avoidance mechanisms. In fact, running TCP/IP over an ATM infrastructure has a number of well-known limitations (cell tax, the number of routing adjacencies required, the inability to identify IP packets in the core of the network without reassembly, and so forth) that may actually obscure congestion-avoidance issues, because there are more network layers that can hide the problem.

    Mindful that a certain amount of packet loss is to be expected in any IP network, how can you support differentiated service classes for specific customers or applications by arranging for some packets to be treated differently from other packets with respect to packet loss? Assume that you offer a fixed amount of bandwidth between two points in your network. As long as the total amount of traffic sent along the path between these two points is less than or equal to the agreed-upon throughput, there should be minimal packet loss after TCP sessions stabilize. This assumption allows us to support the differentiated treatment of packets with respect to loss by deploying multiple queues on each port, rather than just a single FIFO queue. The output traffic stream is first classified, and then different types of packets are placed into different queues. Finally, each queue is given a different share of the port's bandwidth. As long as the amount of traffic placed into each of the queues is less than or equal to the agreed-upon bandwidth for the particular queue, each queue should experience minimal packet loss after the TCP sessions traversing the queue stabilize. (See Figure 8.)


Figure 8: Multiple Queues with Different Shares of a Port's Bandwidth

But what do you do if the amount of traffic placed into a given service class exceeds its agreed-upon throughput? This becomes a policy decision with a number of options to manage the traffic load (a sketch following this list illustrates the options):

• Drop packets that are out-of-profile.

• Mark the packet, and then forward it with an increased drop probability. If the out-of-profile packet experiences congestion at a downstream node, it can be dropped before other in-profile packets are dropped.

• Queue the packet, and then use traffic conditioning tools to control its rate on egress.

• Transmit an explicit congestion notification (ECN) by setting the congestion experienced (CE) bit in the header of packets sourced from ECN-capable transport protocols.
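A minimal sketch of such a policy decision, assuming a meter has already flagged each packet as out-of-profile; the policy names and packet fields here are hypothetical:

    # Hypothetical per-class policy applied to packets the meter has
    # flagged as out-of-profile.
    def handle_out_of_profile(packet, policy):
        if policy == "drop":
            return None                         # discard immediately
        if policy == "mark":
            packet["drop_precedence"] += 1      # forward, but more likely to be
            return packet                       # dropped at a congested hop
        if policy == "shape":
            packet["queue_for_shaping"] = True  # delay on egress to conform
            return packet
        if policy == "ecn" and packet.get("ecn_capable"):
            packet["ce_bit"] = 1                # signal congestion without dropping
            return packet
        return packet  # non-ECN-capable packets fall through unchanged here

    pkt = {"drop_precedence": 0, "ecn_capable": True}
    print(handle_out_of_profile(dict(pkt), "mark"))
    print(handle_out_of_profile(dict(pkt), "ecn"))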

It is important to note that, up to this point, we have limited our discussion of packet loss to the case when a queue becomes 100 percent full. This mechanism is known as tail-drop queue management, because packets are dropped from the logical back, or tail, of the queue. (See Figure 9.)

    Figure 9: Tail-drop Queue Management

Tail-drop queue management is a simple algorithm that is easy to implement. However, it does not discard packets fairly, because it allows a poorly behaved, bursty stream to consume all of a queue's resources, causing packets from other well-behaved streams to be discarded because the queue is 100 percent full.


Although you should expect a limited amount of packet loss in any network, significant packet loss due to sustained congestion adversely affects the operation of your network. Sustained congestion creates critical problems for your network in several ways:

• The exchange of routing information is disrupted, and this can lead to route instability.

• The network is no longer able to absorb bursts of traffic.

• New TCP sessions cannot be established.

• Each of the existing TCP sessions traversing a heavily congested link begins to experience some amount of packet loss. Since packets from different sessions are interleaved, all of the sessions begin to experience packet loss at roughly the same time. This causes each of the individual sessions to go into slow-start, which creates a phenomenon known as global TCP synchronization. When this occurs, all of the TCP sessions across the congested link become synchronized, resulting in periodic surges of traffic. As a result, the link alternates between heavy congestion, as each TCP begins to seek its maximum bandwidth, and light use, as all of the TCPs return to slow-start when they begin to experience congestion. This cycle repeats itself over and over again. Depending on where the congestion occurs in your network, this phenomenon can involve hundreds, thousands, or even tens of thousands of TCP sessions.

Random Early Detection (RED) is an active queue management mechanism that combats the problem of global TCP synchronization, while also introducing a degree of fairness into the discard-selection process.

    A Brief History of Differentiated Services in Large IP Networks

The notion of providing more than just a single best-effort class of service has been part of the IP architecture for more than 20 years. In this section, we examine some of the historical approaches to supporting differentiated service classes in large IP networks.

The First Approach: RFC 791

In September 1981, RFC 791 standardized the Internet Protocol and reserved the second byte of the IP header as the type of service (ToS) field. The bits of the ToS byte were defined as Figure 10 shows.

Figure 10: RFC 791 Bit Definitions of the ToS Byte

       0     1     2     3     4     5     6     7
    +-----------------+-----+-----+-----+-----------+
    |   Precedence    |  D  |  T  |  R  | Reserved  |
    +-----------------+-----+-----+-----+-----------+

The first three bits in the ToS byte (precedence bits) could be set by a node to select the relative priority or precedence of the packet. The next three bits could be set to specify normal or low delay (D), normal or high throughput (T), and normal or high reliability (R). The final two bits of the ToS byte were reserved for future use. However, very little architecture was provided to support the delivery of differentiated service classes in IP networks using these capabilities.


The only application of the IP precedence bits until the mid-1990s was to support a feature known as selective packet discard (SPD). SPD set the precedence bits for control packets (link-level keepalives, routing protocol keepalives, and routing protocol updates) so that, if the network experienced congestion, critical control traffic would be the last to be discarded. The goal was to enhance network stability during periods of congestion. In practice, the DTR bits were never used.

    The Second Approach: The Integrated Services Model (IntServ)

Around 1993, comprehensive work began in the IETF to develop a mechanism that would allow IP to support more than a single best-effort class of service. The goal was to provide real-time service simultaneously with traditional non-real-time service in a shared IP network. This work resulted in the development of the Integrated Services (IntServ) architecture. The IntServ architecture is based on per-flow resource reservation.

    IntServ Architecture

The IntServ architecture defined a reference model that specifies a number of different components and the interplay among these components:

• The resource reservation setup protocol (RSVP) that allows individual applications to request resources from routers and then install per-flow state along the path of the packet flow.

• Two new service models: guaranteed service and controlled load service. Guaranteed service provides firm assurances (through strict admission control, bandwidth allocation, and fair queuing) for applications that require guaranteed bandwidth and delay. The controlled load service does not provide guaranteed bounds on bandwidth or delay, and emulates a lightly loaded, best-effort network.

• Flow specifications that provide a syntax that allows applications to specify their specific resource requirements.

• A packet classification process that examines incoming packets and decides which of the various classes of service should be applied to each packet.

• An admission control process that determines whether a requested reservation can be supported, based on the availability of both local and network resources.

• A policing and shaping process that monitors each flow to ensure that it conforms to its traffic profile.

• A packet scheduling process that distributes network resources (buffers and bandwidth) among the different flows.

The IntServ model requires that source and destination hosts exchange RSVP signaling messages (Path and Resv) to establish packet classification and forwarding state at each node along the path between them. (See Figure 11.)


    Figure 11: Resource Reservation Protocol (RSVP)

While people in the industry learned a tremendous amount during the development of the IntServ architecture, they eventually concluded that IntServ was not a suitable mechanism to support the delivery of differentiated service classes in large IP networks:

• IntServ is not scalable, because it requires significant amounts of per-flow state and packet processing at each node along the end-to-end path. In the absence of state aggregation, the amount of state that needs to be maintained at each node scales in proportion to the number of simultaneous reservations through a given node. The number of flows on a high-speed backbone link could potentially range from tens of thousands to over a million.

• IntServ requires that applications running on end systems support the RSVP signaling protocol. There were very few operating systems that supported an RSVP API that application developers could access.

• IntServ requires that all nodes in the network path support the IntServ model. This includes the ability to map IntServ service classes to link-layer technologies.

While the IntServ model failed, it led to the development and deployment of RSVP, which we now use as a general-purpose signaling protocol for MPLS traffic engineering, fast LSP restoration, and the rapid provisioning of optical links (GMPLS or MPLambdaS). RSVP performs very well as a signaling protocol for MPLS because, in this application, it does not experience the scalability problems associated with IntServ.

    An IntServ Enhancement: Aggregation of RSVP Reservations

As discussed above, one of the major scalability limitations of RSVP is that it does not have the ability to aggregate individually reserved sessions into a single, shared class. In September 2001, RFC 3175 ("Aggregation of RSVP for IPv4 and IPv6 Reservations") defined procedures that allow a single RSVP reservation to aggregate other RSVP reservations across a large IP network. It proposed mechanisms to dynamically establish the aggregate reservation, identify the specific traffic for which the aggregate reservation applies, determine how much bandwidth is required to satisfy the reservation requirement, and reclaim bandwidth when the subreservations are no longer required.

RFC 3175 enhances the scalability of RSVP for use in large IP networks by:

• Reducing the number of signaling messages exchanged and the amount of reservation state that needs to be maintained, by making a limited number of large reservations rather than a large number of small, flow-specific reservations;

• Streamlining the packet classification process in core routers by using the Differentiated Services codepoint, or DSCP (see the discussion of DiffServ that follows), to identify an aggregated flow, instead of the traditional RSVP flow classification mechanism; and

• Simplifying packet queuing and scheduling by combining the aggregated streams into the same queue on an output port.


Among the potential applications for aggregation of RSVP reservations are these three:

• Interconnection of PSTN-call gateways across a provider backbone,

• Aggregation of RSVP paths at the edges of a provider network, and

• Aggregation of RSVP paths across the core of a provider network.

One of the strengths of RSVP is that it supports admission control on a per-flow basis. This can be a powerful tool when supporting premium interactive voice services. Assume that you establish an aggregated RSVP reservation to support 1000 voice calls. As long as there are fewer than 1000 active calls, a new call will be accepted by admission control, which will allocate adequate bandwidth to support subscriber performance requirements. The 1001st call will be denied access by admission control, thus preserving the quality of service delivered to the 1000 established calls.
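A toy sketch of this per-flow admission control against an aggregate reservation (the class and call counts are illustrative):

    class AggregateReservation:
        """Toy per-call admission control against an aggregate RSVP reservation."""
        def __init__(self, max_calls):
            self.max_calls = max_calls
            self.active = 0

        def admit_call(self):
            if self.active < self.max_calls:
                self.active += 1
                return True   # bandwidth reserved; call proceeds at full quality
            return False      # the 1001st call is refused rather than degrading the rest

        def release_call(self):
            self.active = max(0, self.active - 1)

    gateway = AggregateReservation(max_calls=1000)
    results = [gateway.admit_call() for _ in range(1001)]
    print(results.count(True), "admitted,", results.count(False), "denied")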

As you will see in the next section, the DiffServ model performs admission control on a per-packet basis, not on a per-flow basis. This means that, at the edge of a DiffServ domain, calls 1001 through 1100 will be accepted but, because the service class is now out-of-profile, packets will be randomly dropped, thereby degrading the quality of service delivered for all of the calls. You can overcome this limitation of DiffServ by using a combination of aggregated RSVP at the edges of the network to perform per-flow admission control for a voice gateway, plus DiffServ in the core of the network to support application performance requirements across the backbone.

    The Third Approach: The Differentiated Services Model (DiffServ)

Around 1995 or 1996, service providers and various academic institutions began to examine alternative approaches to supporting more than a single best-effort class of service, but this time by using mechanisms that could provide the requisite scalability. As discussed in the previous section, the failure of the IntServ model was due to the signaling explosion and the amount of per-flow state that needed to be maintained at each node in the packet-forwarding path. As a result, all of these new proposals sought to prevent these scalability issues. Figure 12 illustrates the cost, relative to complexity, of the new approaches to supporting differentiated service classes: best-effort service is the simplest and least expensive, IntServ is the most complex and costly, and DiffServ falls in between.

    Figure 12: Cost Relative to Complexity of Differentiated Services Solutions

At that time, there were a number of different proposals to redefine the meaning of the three precedence bits in the ToS byte of the IP header. The proposals ranged from using a single bit, similar to the Frame Relay DE bit, to arbitrary bit definitions and even hybrid approaches, where some bits were used for certain functions and the remaining bits were used for other functions.


There was a lot of talk, some vendor code, but never any real production deployment. The lack of successful deployment was because routers were software-based, and any attempt to make the packet-forwarding process more complicated affected forwarding performance, so it was simply easier to overprovision congested links.

By 1997, the IETF realized that IntServ was not going to be deployed in production networks, and that the commercial sector had been thinking about supporting differentiated service classes for specific customers or applications in a more coarse-grained and more scalable way by using the IP precedence bits. As a result, the IETF created the DiffServ Working Group, which met for the first time in March 1998. The goal of this group was to create relatively simple and coarse methods of providing differentiated classes of service for Internet traffic, to support various types of applications and specific business models.

    The IETF Architecture for Differentiated Services

The DiffServ Working Group has changed the name of the IPv4 ToS octet to the DS byte and defined new meanings for each of the bits. (See Figure 13.) The new specification for the DS Field is applied to both the IPv4 ToS octet and the IPv6 traffic class octet, so that they use a common set of mechanisms to support the delivery of differentiated service classes.

Figure 13: Differentiated Services Field (DS Field)

       0     1     2     3     4     5     6     7
    +-----------------------------------+-----------+
    |  Differentiated Services          |    CU     |
    |  Codepoint (DSCP)                 |           |
    +-----------------------------------+-----------+

The IETF's DiffServ Working Group divides the DS byte into two subfields:

• The six high-order bits are known as the Differentiated Services codepoint (DSCP). The DSCP is used by a router to select the per-hop behavior (PHB) that a packet experiences at each hop within a Differentiated Services domain. A PHB is an externally observable forwarding treatment applied to all packets that belong to the same service class or behavior aggregate (BA).

• The two low-order bits are currently unused (CU) and reserved for future use. These two bits are presently set aside for use by the explicit congestion notification (ECN) experiment. The values of the CU bits are ignored by each node when it determines the PHB to apply to a packet.

The complete DiffServ architecture, defined in RFC 2475, is based on a relatively simple model, whereby traffic that enters a network is first classified, and then possibly conditioned, at the edges of the network. Depending on the result of the packet classification process, each packet is associated with one of the BAs supported by the Differentiated Services domain. The BA that each packet is assigned to is indicated by the specific value carried in the DSCP bits of the DS Field. When a packet enters the core of the network, each router along the transit path applies the appropriate PHB, based on the DSCP carried in the packet's header. It is this combination of traffic conditioning (policing and shaping) at the edges of the network, packet marking at the edges of the network, local per-class forwarding behaviors in the interior of the network, and adequate network provisioning that allows the DiffServ model to support scalable service discrimination across a common IP infrastructure.


    Differentiated Services Domain (DS Domain)

A Differentiated Services domain (DS domain) is a contiguous set of routers that operate with common sets of service provisioning policies and PHB group definitions. (See Figure 14.) A DS domain is typically managed by a single administrative authority that is responsible for ensuring that adequate network resources are available to support the service level specifications (SLSs) and traffic conditioning specifications (TCSs) offered by the domain.

    Figure 14: Differentiated Services Domain (DS Domain)

    A DS domain consists of DS boundary nodes and DS interior nodes.

• DS boundary nodes sit at the edges of a DS domain. DS boundary nodes function as both DS ingress and egress nodes for different directions of traffic flows. When functioning as a DS ingress node, a DS boundary node is responsible for the classification, marking, and possibly conditioning of ingress traffic. It classifies each packet, based on an examination of the packet header, and then writes the DSCP to indicate one of the PHB groups supported within the DS domain. When functioning as a DS egress node, the DS boundary node may be required to perform traffic conditioning functions on traffic forwarded to a directly connected peering domain. DS boundary nodes connect a DS domain to another DS domain or to another non-DS-capable domain.

• DS interior nodes select the forwarding behavior applied to each packet, based on an examination of the packet's DSCP (they honor the PHB indicated in the packet header). DS interior nodes map the DSCP to one of the PHB groups supported by all of the DS interior nodes within the DS domain. DS interior nodes connect only to another DS interior node or a boundary node within the same DS domain.

    Differentiated Service Router Functions

Figure 15 provides a logical view of the operation of a packet classifier and traffic conditioner on a DiffServ-capable router.


    Figure 15: Packet Classifier and Traffic Conditioner

    Packet Classification

A packet classifier selects packets in a traffic stream based on the content of fields in the packet header. The DiffServ architecture defines two types of packet classifiers:

• A behavior aggregate (BA) classifier selects packets based on the value of the DSCP only.

• A multifield (MF) classifier selects packets based on a combination of the values of one or more header fields. These fields can include the source address, destination address, DS Field, protocol ID, source port, destination port, or other information, such as the incoming interface. The result of the classification is written to the DS Field to simplify the packet classification task for nodes in the interior of the DS domain.

After the packet classifier identifies packets that match specific rules, each packet is directed to a logical instance of a traffic conditioner for further processing.
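The following sketch contrasts the two classifier types. The rule table, field names, and codepoints chosen here are hypothetical; a real MF classifier matches on the actual bits of the packet header:

    def ba_classify(packet):
        """Behavior aggregate classification: the DSCP alone selects the class."""
        return packet["dscp"]

    MF_RULES = [
        # (source prefix, destination port, protocol) -> DSCP to write.
        {"src": "10.1.0.", "dport": 5060, "proto": "udp", "dscp": 0b101110},
        {"src": "10.2.0.", "dport": 80,   "proto": "tcp", "dscp": 0b001010},
    ]

    def mf_classify(packet):
        """Multifield classification: match several fields, then mark the DS Field."""
        for rule in MF_RULES:
            if (packet["src"].startswith(rule["src"])
                    and packet["dport"] == rule["dport"]
                    and packet["proto"] == rule["proto"]):
                packet["dscp"] = rule["dscp"]  # simplifies BA classification downstream
                return packet["dscp"]
        return 0b000000  # default PHB

    pkt = {"src": "10.1.0.7", "dport": 5060, "proto": "udp", "dscp": 0}
    print(f"{mf_classify(pkt):06b}")

Note how the MF classifier writes its result into the DS Field, so that interior nodes need only the far cheaper BA lookup.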

Traffic Conditioning

A traffic conditioner may consist of various elements that perform traffic metering, marking, shaping, and dropping. A traffic conditioner is not required to support all of these functions.

• A meter measures a traffic stream to determine whether a particular packet from the stream is in-profile or out-of-profile. The meter passes the in-profile or out-of-profile state information to other traffic conditioning elements so that different conditioning actions can be applied to in-profile and out-of-profile packets.

• A marker writes (or rewrites) the DS Field of a packet header to a specific DSCP, so that the packet is assigned to a particular DS behavior aggregate.

• A shaper delays some or all packets in a traffic stream to bring the stream into conformance with its traffic profile.

• A dropper (policer) discards some or all packets in a traffic stream to bring the stream into conformance with its traffic profile.
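A common way to build the meter element is a token bucket. In the sketch below (the rate and bucket depth are illustrative), a packet is declared in-profile when the bucket holds enough tokens to cover it, and out-of-profile otherwise:

    class TokenBucketMeter:
        """Toy token-bucket meter: 'rate' tokens (bytes) per second, up to 'depth'."""
        def __init__(self, rate, depth):
            self.rate, self.depth = rate, depth
            self.tokens = depth
            self.last = 0.0

        def measure(self, packet_bytes, now):
            # Refill tokens for the elapsed time, capped at the bucket depth.
            self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return "in-profile"     # pass through unchanged
            return "out-of-profile"     # candidate for remarking, shaping, or dropping

    meter = TokenBucketMeter(rate=125_000, depth=10_000)  # ~1 Mbps, 10 KB burst
    for t in (0.0, 0.01, 0.02, 0.03):
        print(t, meter.measure(5_000, t))

The bucket depth sets how large a burst the meter tolerates before packets start being flagged out-of-profile, which is exactly the knob a provider uses when writing a traffic profile.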


    Differentiated Services Router Functions

Figure 16 illustrates the functions that are typically performed by DS boundary routers and DS interior routers.

    Figure 16: DiffServ Router Functions

The DS ingress boundary router generally performs MF packet classification and traffic conditioning functions on incoming microflows. A microflow is a single instance of an application-to-application flow that is ultimately assigned to a behavior aggregate. A DS ingress boundary router can also apply the appropriate PHB, based on the result of this packet classification process.

NOTE: A DS ingress boundary router may also perform BA packet classification if it trusts an upstream DS domain's packet classification.

A DS interior router usually performs BA packet classification to associate each packet with a behavior aggregate. It then applies the appropriate PHB by using specific buffer-management and packet-scheduling mechanisms to support the specific packet-forwarding treatment. Although the DiffServ architecture assumes that the majority of complex packet classification and conditioning occurs at DS boundary routers, the use of MF classification is also supported in the interior of the network.

The DS egress boundary router normally performs traffic shaping as packets leave the DS domain for another DS domain or non-DS-capable domain. A DS egress boundary router may also perform MF or BA packet classification and precedence rewriting if it has an agreement with a downstream DS domain.


    Per-hop Behaviors (PHBs)

A per-hop behavior (PHB) is a description of the externally observable forwarding behavior applied to a particular behavior aggregate. The PHB is the means by which a DS node allocates its resources to different behavior aggregates. The DiffServ architecture supports the delivery of scalable service discrimination based on this hop-by-hop resource allocation mechanism.

PHBs are defined in terms of the behavior characteristics that are relevant to a provider's service provisioning policies. A specific PHB may be defined in terms of:

• The amount of resources allocated to the PHB (buffer size and link bandwidth),

• The relative priority of the PHB compared with other PHBs, or

• The observable traffic characteristics (delay, jitter, and loss).

However, PHBs are not defined in terms of specific implementation mechanisms. Consequently, a variety of different implementation mechanisms may be acceptable for implementing a specific PHB group.

The IETF DiffServ Working Group has defined two PHBs:

• Expedited forwarding PHB

• Assured forwarding PHB

In the future, new DSCPs can be assigned by a provider for its own local use or by new standards activity.

    Expedited Forwarding (EF PHB)

According to the IETF's DiffServ Working Group, the Expedited Forwarding (EF) PHB is designed to provide "low loss, low delay, low jitter, assured bandwidth, end-to-end service." In effect, the EF PHB simulates a virtual leased line to support highly reliable voice or video and to emulate dedicated circuit services. The recommended DSCP for the EF PHB is 101110.

Since the only aspect of delay that you can control in your network is the queuing delay, you can minimize both delay and jitter when you minimize queuing delays. Thus, the intent of the EF PHB is to arrange that suitably marked packets encounter extremely short or empty queues to ensure minimal delay and jitter. You can achieve this only if the service rate for EF packets on a given output port exceeds the usual rate of packet arrival at that port, independent of the load on other (non-EF) PHBs.

The EF PHB can be supported on DS-capable routers in several ways:

• By policing EF microflows to prescribed values at the edge of the DS domain (this is required to ensure that the service rate for EF packets exceeds their arrival rate in the core of the network),

• By ensuring adequate provisioning of bandwidth across the core of your network,

• By placing EF packets in the highest strict-priority queue and ensuring that the minimum output rate is at least equal to the maximum input rate (see the sketch after this list), or

• By rate-limiting the EF aggregate load in the core of your network to prevent inadequate bandwidth for other service classes.

Generally, you will not use RED as a queue memory-management mechanism when supporting the EF PHB, because the majority of the traffic is UDP-based, and UDP does not respond to packet drops by reducing its transmission rate.


    Assured Forwarding (AF PHB)

The Assured Forwarding (AF) PHB is a group of PHBs designed to ensure that packets are forwarded with a high probability of delivery, as long as the aggregate traffic in a forwarding class does not exceed the subscribed information rate. If ingress traffic exceeds its subscribed information rate, then out-of-profile traffic is not delivered with as high a probability as traffic that is in-profile.

The AF PHB group includes four traffic classes. Packets within each AF class can be marked with one of three possible drop-precedence values. The AF PHB group can be used to implement Olympic-style service that consists of three service classes: gold, silver, and bronze. If you wish, you can further differentiate packets within each class by giving them either low, medium, or high drop precedence within the service class. Table 2 summarizes the recommended DSCPs for the four AF PHB groups.

Table 2: Recommended AF DiffServ Codepoint (DSCP) Values

                              AF Class 1   AF Class 2   AF Class 3   AF Class 4
    Low drop precedence       001010       010010       011010       100010
    Medium drop precedence    001100       010100       011100       100100
    High drop precedence      001110       010110       011110       100110
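The recommended AF codepoints in Table 2 follow a regular pattern: the three high-order bits encode the class, the next two bits encode the drop precedence, and the final bit is zero. A short sketch that reproduces the table:

    def af_dscp(af_class, drop_precedence):
        """Build an AF DSCP: class in 1..4, drop precedence 1 (low) to 3 (high)."""
        return (af_class << 3) | (drop_precedence << 1)

    for dp, label in ((1, "low"), (2, "medium"), (3, "high")):
        row = "  ".join(f"{af_dscp(c, dp):06b}" for c in range(1, 5))
        print(f"{label:>6} drop precedence: {row}")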

The AF PHB groups have not been assigned specific service definitions by the DiffServ Working Group. The groups can be viewed as the mechanism that allows a provider to offer differentiated levels of forwarding assurances for IP packets. It is the responsibility of each DS domain to set the quantitative and qualitative differences between AF classes.

In a DS-capable router, the level of forwarding assurance for any given packet depends on:

• The amount of bandwidth and buffer space allocated to the packet's AF class,

• The amount of congestion for the AF class within the router, and

• The drop precedence of the packet.

The AF PHB group can be supported on DS-capable routers by:

• Policing AF microflows to prescribed values at the edge of the DS domain,

• Ensuring adequate provisioning of bandwidth across the core of your network,

• Placing each AF service class into a separate queue,

• Selecting the appropriate queue scheduling discipline to allocate buffer space and bandwidth to each AF service class, and

• Configuring RED to honor the three low-order bits in the DSCP to determine how aggressively a packet is dropped during periods of congestion.

Default PHB

RFC 1812 specifies the default PHB as the conventional best-effort forwarding behavior. When no other agreements are in place, all packets are assumed to belong to this traffic aggregate. A packet assigned to this aggregate may be sent into a network without following any specific


rules, and the network will deliver as many of these packets as possible, as soon as possible, subject to other resource-policy constraints. The recommended DSCP for the default PHB is 000000.

    General Observations about Differentiated Services

In this section, we discuss general observations about the nature of the DiffServ architecture to help you understand what you can or should expect if you decide to deploy it. It is important to maintain a healthy skepticism about DiffServ, because it does not provide a magic solution that can solve all of the congestion-related problems in your network.

    DiffServ Does Not Create Free Bandwidth

Routers are statistical multiplexing devices; therefore, they can experience congestion when the amount of traffic that needs to traverse a port exceeds the output port's capacity. This me