supporting differentiated service classes in large ip networks.pdf
TRANSCRIPT
-
7/27/2019 Supporting Differentiated Service Classes in Large IP Networks.pdf
1/35
Juniper Networks, Inc.
1194 North Mathilda Avenue
Sunnyvale, CA 94089 USA
408-745-2000 or 888-JUNIPER
www.juniper.net
Part Number: 200019-001 12/01
Supporting Differentiated Service Classes
in Large IP Networks
Chuck Semeria
Technical Marketing Engineer
John W. Stewart III
Product Line Manager
White Paper
Copyright 2001, Juniper Networks, Inc.
Contents
Executive Summary .... 4
Perspective .... 4
Fundamentals of Differentiated Services .... 6
Classic Time-division Multiplexing vs. Statistical Multiplexing .... 6
Classic Time-division Multiplexing (TDM) .... 6
Statistical Multiplexing .... 7
Best-effort Delivery .... 10
Differentiated Service Classes .... 10
The Impact of Statistical Multiplexing on Perceived Quality of Service .... 11
Throughput .... 11
Delay .... 12
Sources of Network Delay .... 13
Managing Delay While Maximizing Bandwidth Utilization .... 14
Jitter .... 16
Impact of Jitter on Perceived QoS .... 17
Loss .... 17
Packet Loss Can Be Good .... 18
A Brief History of Differentiated Services in Large IP Networks .... 20
The First Approach: RFC 791 .... 20
The Second Approach: The Integrated Services Model (IntServ) .... 21
IntServ Architecture .... 21
An IntServ Enhancement: Aggregation of RSVP Reservations .... 22
The Third Approach: The Differentiated Services Model (DiffServ) .... 23
The IETF Architecture for Differentiated Services .... 24
Differentiated Services Domain (DS Domain) .... 25
Differentiated Services Router Functions .... 25
Packet Classification .... 26
Traffic Conditioning .... 26
Differentiated Services Router Functions .... 27
Per-hop Behaviors (PHBs) .... 28
Expedited Forwarding (EF PHB) .... 28
Assured Forwarding (AF PHB) .... 29
Default PHB .... 29
General Observations about Differentiated Services .... 30
DiffServ Does Not Create Free Bandwidth .... 30
DiffServ Does Not Change the Speed of Light .... 30
The Strictest Service Guarantees Will Be between Well-Known Endpoints .... 31
Support for Interprovider DiffServ Is a Business Issue .... 31
Providers Do Not Control All Aspects of the User Experience .... 31
Conclusion .... 31
Acronym Definitions .... 33
References .... 34
Requests for Comments (RFCs) .... 34
Internet Drafts .... 34
Textbooks .... 34
Technical Papers .... 35
List of Figures
Figure 1: Classic Time-division Multiplexing (TDM) .... 6
Figure 2: Statistical Multiplexing .... 7
Figure 3: Classic TDM vs. Statistical Multiplexing .... 9
Figure 4: End-to-end Delay Calculation .... 12
Figure 5: Bandwidth Utilization vs. Round-trip Time (RTT) Delay .... 15
Figure 6: Jitter Makes Packet Spacing Uneven .... 16
Figure 7: Sources of Packet Loss in IP Networks .... 17
Figure 8: Multiple Queues with Different Shares of a Port's Bandwidth .... 19
Figure 9: Tail-drop Queue Management .... 19
Figure 10: RFC 791 Bit Definitions of ToS Bytes .... 20
Figure 11: Resource Reservation Protocol (RSVP) .... 22
Figure 12: Cost Relative to Complexity of Differentiated Services Solutions .... 23
Figure 13: Differentiated Services Field (DS Field) .... 24
Figure 14: Differentiated Services Domain (DS Domain) .... 25
Figure 15: Packet Classifier and Traffic Conditioner .... 26
Figure 16: DiffServ Router Functions .... 27
List of Tables
Table 1: Serialization Delay: Packet Size vs. Port Speed .... 14
Table 2: Recommended AF DiffServ Codepoint (DSCP) Values .... 29
Executive Summary
This white paper is the introduction to a series of papers published by Juniper Networks, Inc. that describe the support of differentiated service classes in large IP networks. This overview presents the motivations for deploying multiple service classes, the fundamentals of statistical multiplexing, and the impact of statistical multiplexing on the quality of service delivered by a network in terms of packet throughput, delay, jitter, and loss. We also provide a brief history of the various approaches that have been proposed to support differentiated service classes, a description of the IETF DiffServ architecture, and general observations about what you can expect from the deployment of multiple service classes in your network. The other papers in this series provide technical discussions of queue scheduling disciplines, queue memory management, host TCP congestion-avoidance mechanisms, and other issues related to the deployment of multiple service classes in your network.
Perspective
Service provider IP networks have traditionally supported only public Internet service. Initially, Internet applications (e-mail, remote login, file transfer, and Web access) were not considered mission-critical and did not have specific performance requirements for throughput, delay, jitter, and packet loss. As a result, a single best-effort class of service (CoS) was adequate to support all Internet applications.
However, the commercial success of the Internet has caused all of this to change, thus affecting
service providers in several ways.
- Your IP network is now the single largest consumer of bandwidth, or at least is growing toward this trend.
- Your network's 24/7 availability and reliability are even more imperative. Internet services have become mission-critical. For some organizations, such as online retailers or stock markets, an hour-long network outage can be extremely expensive.
- You need to differentiate your company from the competition by offering a range of service classes with service-level agreements (SLAs) that are specifically tailored to meet your customers' and their customers' requirements.
- You want to offer better classes of service to your premium customers and charge more for those services.
- You are probably considering offering services such as voice-over-IP (VoIP) or virtual private networks (VPNs) that have more rigid performance requirements than traditional Internet applications.

You may also be considering deploying a variety of services over a shared IP infrastructure, each of which has different performance requirements. In a multiservice IP network, IP routers, rather than Frame Relay switches, ATM switches, or voice switches, are used to access the transmission network.

- A larger service portfolio allows you to attract and keep new customers.
- Converged networks minimize your operating expenses, because you have fewer networks to manage.
- A packet-based network maximizes bandwidth efficiency through the use of statistical multiplexing.
There are two fundamentally different approaches to supporting the delivery of multiple service classes in large IP networks. One approach is simply to overprovision the network and throw raw bandwidth at the problem. The other approach is to build a CoS-enabled backbone based on bandwidth management.
Those who favor overprovisioning argue that:

- The additional cost and complexity of managing traffic outweigh the gain it provides in bandwidth efficiency.
- It is very difficult to monitor, verify, and account for multiple service classes in large IP networks.
- You already have other CoS-enabled infrastructures (TDM and ATM) that you can use to support services that have strict performance requirements.
Those who favor bandwidth management argue that:

- Bandwidth management allows you to optimize bandwidth utilization and run your network at close to its maximum capacity.
- New applications emerge, you deploy new networking equipment, and bandwidth arrives in discrete chunks. These events rarely occur in a coordinated manner, and traffic management allows you to control bandwidth and smoothly handle mismatches in network capacity as these transitions occur.
- Bandwidth management allows you to increase your revenue by selling multiple service classes over a shared infrastructure, such as a converged IP/MPLS backbone. A converged infrastructure allows you to reduce your operating expenses, to use a single access technology, and to market a wide range of integrated products, such as Internet access, VPN access, and videoconferencing.
While the arguments for both of these approaches are convincing, their costs are roughly equal. Initially, the deployment of bandwidth management in your network involves simply enabling specific router functions. However, there are a number of hidden training, operational, and maintenance costs involved in successfully managing bandwidth in a production network. Also, while it is relatively easy to understand how to manage bandwidth from an engineering perspective, service providers have very little practical experience in supporting, debugging, tuning, and accounting for multiple service classes in large IP networks. On the other hand, if you do not have the ability to throttle traffic to some degree, even a network of enormous bandwidth can be overrun by misbehaving applications to a point where mission-critical and delay-sensitive services are severely impacted.

Successful providers will adopt a solution that is based on a combination of overprovisioning bandwidth and MPLS traffic engineering to minimize the long-term average level of congestion, while also deploying Integrated Services (IntServ) and Differentiated Services (DiffServ) to address the requirements of delay- and jitter-sensitive traffic during short-term periods of congestion. It is only through a combination of technologies that you will be able to support the delivery of differentiated service classes on a large scale and at a reasonable cost.
Fundamentals of Differentiated Services
To support business objectives that require multiple service classes, there is growing interest in the mechanisms that make it possible to deliver differentiated traffic classes over a common IP infrastructure. Because these mechanisms are widely misunderstood, we begin with a discussion of some of the fundamental concepts that are relevant to the deployment of differentiated service classes.
Classic Time-division Multiplexing vs. Statistical Multiplexing
Network transmission facilities are an expensive resource, as you know. Multiplexing can save you money by allowing many different data flows to share a common physical transmission path, rather than requiring that each flow have a dedicated transmission path. There are currently two basic types of multiplexing used in data communications:

- Time-division multiplexing (TDM): The transmission facility is divided into multiple channels by allocating the facility to several different channels, one at a time.
- Frequency-division multiplexing (FDM): The transmission facility is divided into multiple channels by using different frequencies to carry different signals.

Within TDM, there are two methods of arbitrating bandwidth on an output port: the static allocation of fixed-sized time slots and the dynamic allocation of variable-sized time slots. Classic TDM devices switch traffic by using static arbitration to allocate input bandwidth to an equal amount of output bandwidth and by mapping traffic to a specific output time slot. Packet switches use variable arbitration, with bandwidth allocated on demand on a per-packet basis.
Classic Time-division Multiplexing (TDM)
Classic time-division multiplexing (TDM) is a technique that is applied to circuit-switched networks. TDM assumes that data streams are organized into bits, bytes, or words rather than packets. Figure 1 illustrates the basic concept behind TDM.
Figure 1: Classic Time-division Multiplexing (TDM)
Although the following description is not the classic definition of TDM, it is sufficient to provide a background for our discussion of differentiated service classes. At the ingress end of the shared link, the TDM multiplexer samples and then interleaves the five discrete input data streams in a round-robin fashion, granting each stream the entire bandwidth of the shared link for a very short time. TDM guarantees that the bandwidth of the output link is never less than the sum of the rates of the individual input streams, because, at input, each unit of bandwidth
is mapped at configuration time to an equal-sized unit of bandwidth on the output link. At the egress end of the shared link, the TDM demultiplexer processes the traffic and reconstructs the five individual data streams.
There are two key features of classic TDM that are relevant to our discussion of supporting multiple service classes:

- First, it is not necessary to buffer data when input streams are multiplexed onto the shared output link, because the capacity of the output link is always greater than or equal to the sum of the rates of the individual input streams.
- Second, classic TDM leads to an aggregate underutilization of bandwidth on an output port. Assuming that you are transmitting packet data over a classic TDM system, each input channel consumes somewhere between zero percent and 100 percent of its available bandwidth, depending on the burstiness of the application. If you examine the bandwidth that is not used and add this up for all of the channels in your system, you can achieve an overall bandwidth utilization on the output port of only 10 to 15 percent, depending on the specific behavior of your traffic.
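The second bullet can be illustrated with a short calculation. This is a sketch with hypothetical per-channel utilizations (not figures from this paper): because classic TDM cannot reassign idle time slots, the output port's utilization is simply the average of the per-channel utilizations.

```python
# Classic TDM: idle slots are wasted, so the aggregate output-port
# utilization is just the mean of the per-channel utilizations.
# The per-channel numbers below are hypothetical, for illustration.
channel_util = [0.10, 0.05, 0.30, 0.12, 0.03]  # bursty packet-data channels

port_util = sum(channel_util) / len(channel_util)
print(f"{port_util:.0%}")  # 12% -- in the 10-to-15-percent range cited above
```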
Two common examples of classic TDM in large carrier or provider networks are:

- A T-1 multiplexer with 28 T-1 circuits on the input side and one DS-3 circuit on the output side, or
- A SONET multiplexer with four OC-12c/STM-4 circuits on the input side and one OC-48c/STM-16 circuit on the output side.
Statistical Multiplexing
Statistical multiplexing is designed to support packet-switched networks by dynamically allocating variable-length time slots on an output port. Statistical multiplexing devices assume that data flows are organized into packets, frames, or cells rather than bits, bytes, or words. Figure 2 illustrates the basic concept behind statistical multiplexing.
Figure 2: Statistical Multiplexing
Unlike classic TDM devices, a statistical multiplexing device does not map each unit of input bandwidth to an equal-sized unit of bandwidth on an output port. Statistical multiplexing dynamically allocates bandwidth on an output port only to active input streams, making better use of the available bandwidth and allowing more streams to be transported across the shared port than with other multiplexing techniques.
A packet, frame, or cell arriving on one port of a statistical multiplexing device can potentially exit from any other port of the device. The specific output port is determined by the result of a lookup, based on the contents of the packet header: a MAC address, a VPI/VCI, a DLCI, or an IP address. This means that there may be times when more packets, frames, or cells need to be transmitted from a port than the given port has bandwidth to support. When this occurs, the statistical multiplexing device places the oversubscribed packets, frames, or cells into a buffer (queue) that is associated with the output port. The buffer absorbs packets during the extremely short periods of time when the output port experiences congestion.
Common examples of statistical multiplexing devices in large carrier or provider networks include:

- IP routers,
- Ethernet switches, and
- Frame Relay switches.
Optimal Buffer Size
Determining the optimal size for a packet buffer is critical, because providing a packet buffer
that is too small is just as bad as providing a packet buffer that is too large.
- Small packet buffers can cause packets from bursts to be dropped. This forces a host TCP to reduce its transmission rate by returning to slow-start or congestion-avoidance mode. This can severely reduce the session's overall packet throughput rate.
- Large packet buffers at each hop can cause the total round-trip time (RTT) to increase to a point where packets that are waiting in buffers in the core of a network are retransmitted by the source TCP even though they have not been dropped. A source TCP maintains a retransmission timer that it uses to decide when it should start retransmitting lost packets if it does not receive an ACK from the destination TCP.
Optimally, a router buffer needs to be large enough to absorb the burstiness of traffic flows but
small enough that the RTT remains relatively small, so that packets waiting in queues are not
mistakenly retransmitted.
The amount of memory that needs to be assigned to each queue is determined by the speed of the link, the behavior of the traffic, and the characteristics of the higher-layer transport protocol that provides flow control. For a queue designed to support UDP-based, real-time applications, such as VoIP, a large packet buffer is not desirable, because it can increase end-to-end delay. However, for a queue designed to support TCP-based applications, optimal performance requires that the bandwidth-delay buffer size be calculated using the following formula:

Buffer_Size = (port bandwidth) * (longest RTT of any flow forwarded across the port)

For example, the size of the buffer required to support a maximum round-trip delay of 100 ms on an OC-48c/STM-16 port is ~32 MB.
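The formula and the OC-48c example can be checked with a few lines of arithmetic. This is a sketch; the function name is ours, and the rate used is the standard ~2.488 Gbps OC-48c/STM-16 line rate.

```python
# Bandwidth-delay buffer sizing, per the formula above:
# buffer bytes = port rate (bits/s) * longest RTT (seconds) / 8 bits per byte.
def buffer_size_bytes(port_bps, rtt_s):
    return port_bps * rtt_s / 8

OC48C_BPS = 2_488_320_000  # OC-48c/STM-16 line rate, ~2.488 Gbps

size = buffer_size_bytes(OC48C_BPS, 0.100)  # 100 ms maximum RTT
print(f"{size / 1e6:.1f} MB")  # 31.1 MB -- roughly the ~32 MB cited above
```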
Bandwidth Oversubscription
Voice networks have always been oversubscribed, in that dedicated bandwidth is not reserved for each potential voice user. Carriers can oversubscribe their voice networks because there are far more voice subscribers than there are voice calls at any given moment. Generally, it is easier to provision a voice network than a data network, because you have a much better understanding of the call activity you expect to see at any time of the day than you do of the amount of data traffic your network will be required to transport at the same time. However, we have all experienced situations when all circuits are busy during catastrophic events.
In packet-based networks, statistical multiplexing takes advantage of the fact that each host attached to a network is not always active, and when it is active, data is transmitted in bursts. As a result, statistical multiplexing allows you to oversubscribe network resources and support a greater number of flows than classic TDM using the same amount of bandwidth. This is known as the statistical multiplexing gain.
Figure 3 shows three hosts transmitting data. When the network uses classic TDM to access the output port, certain time slots remain empty, which causes the bandwidth of those time slots to be wasted. In contrast, when the network uses statistical multiplexing to access the output port, empty time slots are not transmitted, so this extra bandwidth can be used to support the transmission of other statistically multiplexed flows.
Figure 3: Classic TDM vs. Statistical Multiplexing
Let's examine typical oversubscription numbers used by large service providers. Core links are typically oversubscribed by a factor of 2X, while access links are generally oversubscribed by a factor of 8X (the potential capacity going into the core is 8 times more than the core can transport). As long as the queues in the network usually remain empty, the network will continue to provide satisfactory performance at these oversubscription levels. If the traffic patterns in the network are well understood, then it is possible to apply an oversubscription policy that ensures that queues do, in fact, usually remain empty. The oversubscription capabilities supported by statistical multiplexing devices offer monetary savings. For example, an oversubscription policy of 20 percent allows packets from almost 23 E-3 circuits (775 Mbps) to be aggregated onto a single OC-3/STM-1 circuit (155 Mbps).
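The E-3/OC-3 arithmetic can be sketched in a few lines. The function name is ours; the policy is expressed as the fraction of aggregate input capacity the trunk can actually carry (0.20 means inputs may total five times the trunk's bandwidth), and standard line rates are used rather than the rounded figures quoted above.

```python
# Oversubscription arithmetic: how many E-3 circuits can feed one
# OC-3/STM-1 trunk under a 20 percent oversubscription policy.
E3_MBPS = 34.368   # E-3 line rate
OC3_MBPS = 155.52  # OC-3/STM-1 line rate

def max_input_circuits(circuit_mbps, trunk_mbps, policy_fraction):
    # Total input capacity the policy permits, then whole circuits that fit.
    allowed_input_mbps = trunk_mbps / policy_fraction
    return int(allowed_input_mbps // circuit_mbps)

print(max_input_circuits(E3_MBPS, OC3_MBPS, 0.20))  # 22 -- "almost 23" E-3s
```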
Statistical Multiplexing and Multiple Service Classes
As a foundation for our discussion of differentiated service classes, there are two key features to keep in mind regarding statistical multiplexing:

- Statistical multiplexing requires packet buffering during transient periods of congestion, when the output-port bandwidth is momentarily less than the sum of the rates of the input flows seeking to use that bandwidth.
- Statistical multiplexing provides significantly better utilization of the output port bandwidth than classic TDM. This enhanced utilization can be approximately four times (4X) greater than classic TDM, depending on the specific traffic flows. The higher utilization of output port bandwidth is the key benefit of statistical multiplexing when compared with classic TDM.
Best-effort Delivery
IP routers perform statistical multiplexing because they are packet switches. The Internet Protocol is a datagram protocol, where each packet is routed independently of all other packets without the concept of a connection. IP has traditionally offered only a single class of service, known as best-effort delivery, where all packets traversing the network were treated with the same priority. Best-effort means that IP makes a reasonable effort to deliver each datagram to its destination with uncorrupted data, but there are no guarantees that a packet will not be corrupted, duplicated, reordered, or misdelivered. Additionally, there are no promises with respect to the amount of throughput, delay, jitter, or loss that a traffic stream will experience. The network makes a best-effort attempt to satisfy its clients and does not arbitrarily discard packets. However, best-effort service without the support of intelligent transport protocols would lead to chaos. The only reason that best-effort works in global IP networks is that TCP does not compromise the network when it experiences congestion, but rather detects and then responds smoothly to packet loss by reducing its transmission rate. TCP is the basic building block that makes the best-effort queue the most well-behaved queue in a router, because it backs off when it experiences congestion.

Best-effort delivery is not a pejorative term. In fact, the ability to support a single best-effort service has allowed large IP networks and the Internet to become what they are today: the unchallenged technology of choice for supporting mission-critical applications at a global scale. However, there are a number of perceived issues related to IP's ability to support only a single best-effort class of service and the potential impact on IP's continued commercial success. Some carriers and providers see the need to offer multiple service levels if they are to support the deployment of new services, each with different performance requirements, over a shared IP infrastructure.
Differentiated Service Classes
Supporting multiple service classes for specific applications or customers is concerned with treating packets that belong to certain data streams differently from packets that belong to other data streams. Multiple service classes are all about providing "managed unfairness" to certain traffic classes.
Differentiated service levels are supported by manipulating the key attributes of certain streams to change the customer's perception of the quality of service that the network is delivering. These attributes include:

- The amount of data that can be transmitted per unit of time (throughput),
- The amount of time that it takes for data to be transmitted from one point to another point in the network (delay or latency),
- The variation in this delay over time (jitter) for consecutive packets in a given flow, and
- The percentage of transmitted data that does not arrive at its destination correctly (loss).

However, the quality of service provided to a given service class can be only as good as the lowest quality of service delivered by the weakest link in the end-to-end path.
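The weakest-link point can be made concrete with a rough numeric sketch (the per-hop delays and loss rates below are hypothetical): per-hop delays add, and per-hop delivery probabilities multiply, so a single poor link dominates the end-to-end result.

```python
# End-to-end QoS across a path: delay is additive and loss compounds
# multiplicatively, so the weakest hop dominates both.
# (one-way delay in ms, loss fraction) per hop -- hypothetical values.
hops = [(2, 0.0001), (5, 0.0001), (40, 0.02), (3, 0.0001)]

total_delay_ms = sum(delay for delay, _ in hops)
delivered = 1.0
for _, loss in hops:
    delivered *= 1.0 - loss
end_to_end_loss = 1.0 - delivered

print(total_delay_ms)             # 50 -- dominated by the 40 ms hop
print(round(end_to_end_loss, 4))  # 0.0203 -- dominated by the 2% hop
```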
The concept of multiple service classes is not applicable to classic TDM services, because if a TDM link is up, then bandwidth, delay, and jitter are constant, and the percentage of packet loss is zero. Any errors that occur result in bandwidth going to zero, delay going to infinity, and loss going to 100 percent. For classic TDM services, the concept of differentiated service classes involves providing different uptime commitments and meeting different customer service requirements, restoration times, and so forth.
The need to provide multiple service classes for customers or applications applies much more to the delivery of statistically multiplexed services. This is because specific packet flows of interest traverse several routers, and the quality of service perceived by individual users is a function of the way that statistical multiplexing is performed at each hop in the path, as well as the characteristics of the individual links in the path. By treating some packets differently from others when performing statistical multiplexing, a network of routers can offer different kinds of throughput, delay, jitter, and loss for different packet flows.
Finally, supporting differentiated service classes through bandwidth reservations or lower oversubscription factors for higher-priority services results in a less efficient use of network bandwidth than if you provide only a single best-effort statistical multiplexing service. However, you can compensate for your lower bandwidth efficiency by charging your subscribers a premium for higher-priority services.
NOTE
Once you make the business decision to offer multiple levels of service, it is important to perform the analysis that is necessary to determine exactly how much more you need to charge your subscribers to maintain your profit margins and to compensate for your loss of bandwidth efficiency.
The Impact of Statistical Multiplexing on Perceived Quality of Service
In this section, we examine how the statistical multiplexing performed by routers can influence
the user's perception of the quality of service delivered by a network. The quality of service
attributes that can be affected by statistical multiplexing include:
- Throughput
- Delay
- Jitter
- Loss
Throughput
Throughput is a generic term for the capacity of a system to transfer data. Measuring the throughput of a TDM service is easy, because the throughput is simply the bandwidth of the transmission channel; for example, the throughput of a DS-3 circuit is 45 Mbps. For TCP/IP statistically multiplexed services, however, throughput is much harder to define and measure, because there are numerous ways to calculate it, including:
- The packet or byte rate across the circuit
- The packet or byte rate of a specific application flow
- The packet or byte rate of host-to-host aggregated flows
- The packet or byte rate of network-to-network aggregated flows
The most direct way that a router's statistical multiplexing can be tuned to affect throughput is
by the amount of bandwidth it allocates to different types of packets.
- In classic best-effort service, the router does not explicitly control the amount of bandwidth assigned to different traffic classes. Instead, during periods of congestion, all packets are placed into a single first-in, first-out (FIFO) queue. When faced with congestion, User Datagram Protocol (UDP) flows continue to transmit at the same rate, but TCP flows detect packet loss and react by reducing their transmission rate. As a result, UDP flows end up consuming the majority of the bandwidth on the congested port, while each TCP flow receives a roughly equal share of the leftover bandwidth.
- To support differentiated treatment, each class of traffic can be given a different share of the output-port bandwidth. For example, a router can be configured to allocate a different amount of bandwidth to each class of traffic on the output port; one class of traffic can be given strict priority over all other classes; or one class can be given strict priority subject to a bandwidth limit (to prevent starvation of the other classes). Supporting differentiated service classes implies the use of more than just a single FIFO queue on each output port.
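As an illustration of per-class bandwidth shares, the following is a minimal weighted round-robin sketch in Python; the class names and weights are hypothetical, not values from this paper:

```python
from collections import deque

class WrrScheduler:
    """Weighted round-robin: each class drains up to `weight` packets per
    round, approximating a proportional share of output-port bandwidth."""

    def __init__(self, weights):
        self.weights = weights                      # class name -> integer weight
        self.queues = {c: deque() for c in weights}

    def enqueue(self, cls, packet):
        self.queues[cls].append(packet)

    def next_round(self):
        """Return the packets transmitted in one scheduling round."""
        sent = []
        for cls, weight in self.weights.items():
            for _ in range(weight):
                if self.queues[cls]:
                    sent.append(self.queues[cls].popleft())
        return sent
```

With weights such as `{"premium": 3, "best_effort": 1}`, a busy port sends three premium packets for every best-effort packet, rather than serving everything from one FIFO queue.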
Delay
Delay (or latency) is the amount of time that it takes for a packet to travel from one point in a network to another. A number of factors contribute to the delay a packet experiences as it traverses your network:
- Forwarding delay
- Queuing delay
- Propagation delay
- Serialization delay
Figure 4 illustrates that the end-to-end delay can be calculated as the sum of the individual forwarding, queuing, serialization, and propagation delays occurring at each node and link in your network.
Figure 4: End-to-end Delay Calculation
However, when examining the causes of application delay in your network, it is important to remember that the routers represent only a part of the end-to-end path and that you must also consider several other factors:
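The calculation that Figure 4 illustrates can be sketched directly; the per-hop numbers below are invented placeholders, not measurements:

```python
def end_to_end_delay_ms(node_delays, link_delays):
    """Sum per-node (forwarding + queuing + serialization) and per-link
    (propagation) delays along the path, all in milliseconds."""
    node_total = sum(fwd + queue + ser for fwd, queue, ser in node_delays)
    return node_total + sum(link_delays)

# Three routers and two links with illustrative values:
nodes = [(0.05, 0.20, 0.0048)] * 3   # forwarding, queuing, serialization (ms)
links = [10.0, 5.0]                  # propagation per link (ms)
total = end_to_end_delay_ms(nodes, links)
```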
- Performance bottlenecks within hosts and servers
- Operating system scheduling delays
- Application resource contention delays
- Physical layer framing delays
- CODEC encoding, compression, and packetization delays
- The quality of the TCP/IP implementations running on these end systems
- The stability of routing in the network
Sources of Network Delay
In this section, we examine each of the sources of delay: forwarding, queuing, propagation, and serialization delay.
Forwarding Delay
Forwarding delay is the amount of time that it takes a router to receive a packet, make a forwarding decision, and begin transmitting the packet through an uncongested output port. It represents the minimum amount of time the router needs to perform its basic function and is typically measured in tens or hundreds of microseconds (0.000001 sec). Other than deploying industry-standard hardware-based routers, you have no real control over forwarding delay.
Queuing Delay
Queuing delay is the amount of time that a packet waits in a queue, while the system performs statistical multiplexing and other packets are serviced ahead of it, before it can be transmitted on the output port. The queuing delay at a given router varies over time, from zero seconds on an uncongested link up to the sum of the times needed to transmit each of the packets queued ahead of it. During periods of congestion, queue memory management and queue scheduling disciplines let you control the amount of queuing delay experienced by different classes of traffic placed in different queues.
Propagation Delay
Propagation delay is the amount of time that it takes for electrons or photons to traverse a physical link. It is bounded by the speed of light and is measured in milliseconds (0.001 sec). When estimating the propagation delay across a point-to-point link, you can assume one millisecond (1 ms) of delay per 100 miles (160 km) of round-trip distance. Consequently, the speed-of-light propagation RTT from San Francisco to New York (6000 mi, 9654 km round trip) is between 60 and 70 ms (0.060 to 0.070 sec). Because you can't change the speed of light in optical fiber, you have no control over propagation delay.
It is interesting to note that the speed of light in optical fiber is approximately 65 percent of the speed of light in a vacuum, while electron propagation through copper is slightly faster, at 75 percent of the speed of light. Although the signal representing each bit travels slightly faster in copper than in fiber, fiber has numerous advantages over copper: it results in fewer bit errors, supports longer cable runs between repeaters, and allows more bits to be packed into a given length of cable. For example, a 10 Mbps copper interface (traditional Ethernet) carries 78 bits per mile (124 bits per km), so a 1500-byte packet is 154 miles (248 km) long on the wire. In contrast, a 2.488 Gbps fiber interface (OC-48c/STM-16) carries 19,440 bits per mile (31,104 bits per km), so a 1500-byte packet is only 3,260 feet (994 m) long.
Serialization Delay
Serialization delay is the amount of time that it takes to place the bits of a packet onto the wire when a router transmits the packet. It is measured in milliseconds (ms, or 0.001 sec) and is a function of the size of the packet and the speed of the port. Because there is no practical mechanism to control the size of the packets in your network (other than reducing the MTU or forcing packet fragmentation), the only action you can take to reduce serialization delay is to install higher-speed router interfaces.
Table 1 displays the serialization delay for various packet sizes and different port speeds.
Table 1: Serialization Delay by Packet Size and Port Speed
From Table 1, you can see that it takes 7.7 ms to place a 1500-byte packet on a DS-1 circuit. This is a significant amount of time when you consider that the typical one-way propagation delay from San Francisco to New York (3000 mi, 4827 km) is between 30 and 35 ms. On the other hand, the serialization delay for a 1500-byte packet on an OC-192c/STM-64 port is only 0.0012 ms. In a network of high-speed interfaces, serialization delay contributes an insignificant amount to the overall end-to-end delay; in a network of low-speed interfaces, it can contribute significantly.
Managing Delay While Maximizing Bandwidth Utilization
Given that the only component of end-to-end delay that you can actually control is queuing delay, support for differentiated service classes is based on managing the queuing delay experienced by different traffic classes during periods of network congestion. In the absence of active queue management techniques such as Random Early Detection (RED), there is a direct relationship between the bandwidth utilization on a link and the RTT delay. If you maintain a 5-minute weighted bandwidth utilization of 10 percent, there will be minimal packet loss and minimal RTT delay, because the output ports are generally underused. However, if you increase the 5-minute weighted bandwidth utilization to approximately 50 percent, the average RTT starts to increase exponentially as the load on your network grows. (See Figure 5.)
Packet Size   DS-1         DS-3        OC-3        OC-12       OC-48       OC-192
40 byte       0.2073 ms    0.0072 ms   0.0021 ms   0.0005 ms   0.0001 ms   < 0.0001 ms
256 byte      1.3264 ms    0.0458 ms   0.0132 ms   0.0033 ms   0.0008 ms   0.0002 ms
320 byte      1.6580 ms    0.0572 ms   0.0165 ms   0.0041 ms   0.0010 ms   0.0003 ms
512 byte      2.6528 ms    0.0916 ms   0.0264 ms   0.0066 ms   0.0016 ms   0.0004 ms
1500 byte     7.7720 ms    0.2682 ms   0.0774 ms   0.0193 ms   0.0048 ms   0.0012 ms
4470 byte     23.1606 ms   0.7994 ms   0.2307 ms   0.0575 ms   0.0144 ms   0.0036 ms
9180 byte     47.5648 ms   1.6416 ms   0.4738 ms   0.1181 ms   0.0295 ms   0.0074 ms
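The entries in Table 1 all follow from a single formula, sketched here; the line rates used are the standard PDH/SONET figures in Mbps:

```python
def serialization_delay_ms(packet_bytes, line_rate_mbps):
    """Milliseconds needed to clock every bit of a packet onto the wire."""
    return packet_bytes * 8 / (line_rate_mbps * 1000.0)

# A 1500-byte packet on a DS-1 (1.544 Mbps) vs. an OC-192c (9953.28 Mbps):
slow = serialization_delay_ms(1500, 1.544)      # roughly 7.77 ms
fast = serialization_delay_ms(1500, 9953.28)    # roughly 0.0012 ms
```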
Figure 5: Bandwidth Utilization vs. Round-trip Time (RTT) Delay
The challenge in managing delay is that, at the same time, you also need to maximize bandwidth utilization in your network for financial reasons. Bear in mind that bandwidth utilization statistics are meaningful only when the length of the observation period is specified. If you measure bandwidth utilization over one nanosecond (0.000000001 sec), you get one of two values: zero percent or 100 percent. If you measure the utilization of a circuit over 5 minutes, you get a reasonably damped average. Whenever we discuss bandwidth utilization here, we always mean a 5-minute weighted average.
A 5-minute weighted bandwidth utilization of 50 percent doesn't mean just 50 percent utilization. It means that there are short, sub-second intervals when utilization is close to 100 percent, queues fill up, and packets are dropped, and other periods when utilization is close to zero percent, queue depth is zero, and packets are never dropped. A 5-minute weighted average utilization of 50 to 60 percent is considered heavy bandwidth utilization. If financial factors compel you to drive your utilization up to 70 or 75 percent, you dramatically increase the RTT delay, and the variation in RTT delay, for all applications running across your network.
Your dilemma, then, is how to optimize the bandwidth utilization of your network while also managing queuing delays for delay-sensitive traffic. To find the solution, you must first determine which applications in your network can cope with increasing delay and delay variation. TCP-based applications are specifically designed to be rate-adaptive and to cope with delay, but other types of applications, such as real-time voice, are unable to operate smoothly when experiencing long delays or delay variation.
Therefore, the solution to optimizing bandwidth utilization while also managing queuing delays is to isolate the applications that cannot handle delay from the 50 to 60 percent utilization class. You can accomplish this by placing packets from those applications into a dedicated queue that does not experience the aggregate delay caused by the high utilization of the circuit. In effect, you identify a certain set of applications, isolate them from other types of traffic by placing them into a dedicated queue, and then control the amount of queuing delay experienced by those specific applications.
There are three things that you need to keep in mind with respect to delay in your network:
- In a well-designed and properly functioning network, queuing delay should average zero when measured over time. There will always be extremely short periods of congestion, but network links need to be properly provisioned; otherwise, queuing delay increases rapidly, because you are asking too much traffic to cross an underprovisioned link.
- If you examine the relative impact of the delay factors other than queuing (forwarding, propagation, and serialization), propagation delay is the dominant source of delay, by several orders of magnitude.
- The only delay factor that you can control is queuing delay; you have no real control over the other factors.
Jitter
Jitter is the variation over time in the delay experienced by consecutive packets in the same flow. (See Figure 6.) You can measure jitter with a number of different statistics, including the mean, standard deviation, maximum, or minimum of the interpacket arrival times for consecutive packets in a given flow.
Figure 6: Jitter Makes Packet Spacing Uneven
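The measurement techniques listed above reduce to simple statistics over interpacket gaps. A sketch, where the arrival times are invented for illustration:

```python
import statistics

def jitter_stats(arrival_times_ms):
    """Summarize the variation in interpacket arrival times for one flow."""
    gaps = [b - a for a, b in zip(arrival_times_ms, arrival_times_ms[1:])]
    return {
        "mean": statistics.mean(gaps),
        "stdev": statistics.pstdev(gaps),
        "max": max(gaps),
        "min": min(gaps),
    }

# Packets sent every 20 ms arrive unevenly spaced:
stats = jitter_stats([0.0, 20.0, 45.0, 60.0])   # gaps: 20, 25, 15 ms
```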
TDM systems can cause jitter, but the variation in delay is so small that, for all practical purposes, you can ignore it. In a statistically multiplexed network, the primary source of jitter is the variability of queuing delay over time for consecutive packets in a given flow. Another potential source of jitter is that consecutive packets in a flow may not follow the same physical path across the network, due to equal-cost load balancing or routing changes.
Like delay, jitter increases exponentially with bandwidth utilization. You can see this by executing a number of pings across a highly used link: you will notice not only an increase in delay, but also an increase in the variation of delay.

There are a couple of other considerations relevant to jitter in statistically multiplexed networks:
- In statistically multiplexed networks, the end-to-end jitter is never constant, because the level of congestion in a network changes from place to place and from moment to moment. Unless you are assured that the transmission of a packet will begin immediately after a router's forwarding decision, the amount of delay introduced at each hop in an end-to-end path is variable.
- ATM has traditionally supported real-time traffic by using 53-byte cells as a way to place an upper bound on the amount of delay that a cell is subject to at any single network node. The point is that a 53-byte time period is far shorter than a 1500-byte time period.
Impact of Jitter on Perceived QoS
Some applications are unable to handle jitter.
- With interactive voice or video applications, jitter can give the sound or image a jerky or uneven quality. The solution is to properly provision the network, including the queue scheduling discipline, and to condition traffic so that jitter stays within acceptable limits. The jitter that remains can be handled by a short playback buffer on the destination host, which holds packets briefly before playing them back as a smoothed data stream.
- For emulated TDM service over a statistically multiplexed network, jitter outside of a narrowly defined range can introduce errors. The solution is to properly provision the network, including priority queuing, and to condition traffic at the edges of the network so that jitter stays within the predefined range.
However, for other types of applications, such as those that run over TCP/IP, jitter is not a problem. Likewise, for non-interactive applications such as streaming voice or video, jitter does not present serious problems, because it can be overcome with large playback buffers.
Loss
There are three sources of packet loss in an IP network, as illustrated in Figure 7:
- A break in a physical link that prevents the transmission of a packet
- A packet corrupted by noise, detected by a checksum failure at the downstream node
- Network congestion that leads to buffer overflow
Figure 7: Sources of Packet Loss in IP Networks
Breaks in physical links do occur, but they are rare, and the combination of self-healing physical layers and redundant topologies responds dynamically to this source of packet loss. With the exception of wireless networking, when using modern physical-layer technologies the chance of packet corruption is statistically insignificant, so you can ignore this source of packet loss as well.
Consequently, the primary cause of packet loss in a non-wireless IP network is buffer overflow resulting from congestion. The amount of packet loss in a network is typically expressed as the probability that a given packet will be discarded by the network.
IP networks do not carry a constant load; traffic is bursty, which causes the load on the network to vary over time. There are periods when the volume of traffic that the network is asked to carry exceeds the capacity of some of its components. When this occurs, congested network nodes attempt to reduce their load by discarding packets. When the TCP/IP stack on a host detects a packet loss, it assumes that the loss is due to congestion somewhere in the network.
Packet Loss Can Be Good
It is important to understand that packet loss in an IP network is not always a bad thing. Each TCP session seeks all of the bandwidth that it can for its flow, but it must find that maximum bandwidth without causing sustained congestion in the network. TCP accomplishes this by transmitting slowly at the beginning of each session (slow start) and then increasing the transmission rate until it eventually detects the loss of a packet. Because TCP interprets a packet drop as congestion at some point in the network, it reacts by temporarily reducing the transmission rate of the flow. Given enough time, each TCP flow eventually settles on the maximum bandwidth it can push across the network without experiencing sustained congestion. When multiple TCP flows do this in parallel, the result is fairness across all TCP sessions in the network. Thus, occasional packet loss is good, because each TCP session needs to experience some amount of packet loss to discover how much bandwidth is available for its flow.
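The probe-then-back-off behavior just described can be caricatured in a few lines; the window sizes, threshold, and loss round below are invented for illustration, not a real TCP implementation:

```python
def cwnd_trace(rounds, ssthresh=32, loss_rounds=()):
    """Toy TCP congestion-window trace, in segments: exponential growth
    during slow start, additive increase afterward, halving on loss."""
    cwnd, trace = 1, []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:              # loss detected: multiplicative decrease
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd *= 2                     # slow start
        else:
            cwnd += 1                     # congestion avoidance (additive increase)
    return trace
```

`cwnd_trace(7)` climbs 1, 2, 4, ..., 32 and then creeps up one segment per round; adding a loss round shows the window being halved before the probe resumes.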
Host response to network congestion is the same whether your IP network runs over packets or over an ATM infrastructure, because TCP congestion-avoidance mechanisms execute at the transport layer, not the data link layer. An ATM transport possesses no remarkable properties that let you better control the amount of traffic a host injects into your network, because the applications are native-IP-based, not ATM-based. If you need to control end-system behavior, you are still required to perform traffic policing or shaping at the ingress edges of your network. ATM can support this only at a relatively coarse level, because it is not aware of TCP/IP or the operation of its congestion-avoidance mechanisms. In fact, running TCP/IP over an ATM infrastructure has a number of well-known limitations (cell tax, the number of routing adjacencies required, the inability to identify IP packets in the core of the network without reassembly, and so forth) that may actually obscure congestion-avoidance issues, because there are more network layers that can hide the problem.
Mindful that a certain amount of packet loss is to be expected in any IP network, how can you support differentiated service classes for specific customers or applications by arranging for some packets to be treated differently from others with respect to loss? Assume that you offer a fixed amount of bandwidth between two points in your network. As long as the total amount of traffic sent along the path between these two points is less than or equal to the agreed-upon throughput, there should be minimal packet loss after TCP sessions stabilize. This assumption allows us to support the differentiated treatment of packets with respect to loss by deploying multiple queues on each port, rather than just a single FIFO queue. The output traffic stream is first classified, and different types of packets are placed into different queues; each queue is then given a different share of the port's bandwidth. As long as the amount of traffic placed into each queue is less than or equal to the agreed-upon bandwidth for that queue, each queue should experience minimal packet loss after the TCP sessions traversing it stabilize. (See Figure 8.)
Figure 8: Multiple Queues with Different Shares of a Ports Bandwidth
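One way to realize the classifier stage of Figure 8 is to map each packet's IP precedence onto a queue index; the four-queue split below is a hypothetical policy for illustration, not one prescribed by this paper:

```python
def classify(precedence):
    """Map IP precedence (0-7) to one of four output queues."""
    if precedence >= 6:
        return 0      # network control
    if precedence >= 4:
        return 1      # premium
    if precedence >= 2:
        return 2      # assured
    return 3          # best effort

# Each queue would then be granted its configured share of port bandwidth
# by the scheduler, as in Figure 8.
```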
But what do you do if the amount of traffic placed into a given service class exceeds its agreed-upon throughput? This becomes a policy decision, with a number of options for managing the traffic load:
- Drop packets that are out-of-profile.
- Mark the packet, and then forward it with an increased drop probability. If the out-of-profile packet experiences congestion at a downstream node, it can be dropped before in-profile packets are dropped.
- Queue the packet, and then use traffic-conditioning tools to control its rate on egress.
- Transmit an explicit congestion notification (ECN) by setting the congestion experienced (CE) bit in the header of packets sourced from ECN-capable transport protocols.
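Whichever option you choose, the in-profile/out-of-profile decision itself is typically made with a token bucket. A minimal sketch, where the rate and burst values are arbitrary examples:

```python
class TokenBucket:
    """Test packets against a committed rate and burst size."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0        # token refill rate, bytes per second
        self.burst = burst_bytes          # bucket depth, bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def in_profile(self, size_bytes, now):
        """Refill tokens for elapsed time, then test the packet against them."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True                   # conforms: forward normally
        return False                      # out-of-profile: drop, mark, or queue
```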
It is important to note that, up to this point, we have limited our discussion of packet loss to the case when a queue becomes 100 percent full. This mechanism is known as tail-drop queue management, because packets are dropped from the logical back, or tail, of the queue. (See Figure 9.)
Figure 9: Tail-drop Queue Management
Tail-drop queue management is a simple algorithm that is easy to implement. However, it does not discard packets fairly, because it allows a poorly behaved, bursty stream to consume all of a queue's resources, causing packets from other, well-behaved streams to be discarded because the queue is 100 percent full.
Although you should expect a limited amount of packet loss in any network, significant packet loss due to sustained congestion adversely affects the operation of your network. Sustained congestion creates critical problems in several ways:
- The exchange of routing information is disrupted, which can lead to route instability.
- The network is no longer able to absorb bursts of traffic.
- New TCP sessions cannot be established.
- Each of the existing TCP sessions traversing a heavily congested link begins to experience some amount of packet loss. Because packets from different sessions are interleaved, all of the sessions begin to experience packet loss at roughly the same time, which drives each of the individual sessions into slow start. This creates a phenomenon known as global TCP synchronization: all of the TCP sessions across the congested link become synchronized, resulting in periodic surges of traffic. The link alternates between heavy congestion, as each TCP session seeks its maximum bandwidth, and light use, as all of the sessions return to slow start when they detect congestion. This cycle repeats itself over and over. Depending on where the congestion occurs in your network, the phenomenon can involve hundreds, thousands, or even tens of thousands of TCP sessions.

Random Early Detection (RED) is an active queue management mechanism that combats the problem of global TCP synchronization while also introducing a degree of fairness into the discard-selection process.
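RED's discard decision can be sketched as an exponentially weighted moving average of queue depth plus a linear drop-probability ramp between two thresholds; the weight and threshold values below are illustrative, not recommended settings:

```python
def ewma_queue_depth(avg, sample, weight=0.002):
    """RED tracks a smoothed average queue depth, not the instantaneous one."""
    return (1 - weight) * avg + weight * sample

def red_drop_probability(avg_q, min_th, max_th, max_p):
    """No drops below min_th, certain drop above max_th, linear ramp between."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)
```

Because drops begin early and fall randomly across flows, individual TCP sessions back off at different times instead of all at once, breaking the synchronization cycle described above.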
A Brief History of Differentiated Services in Large IP Networks
The notion of providing more than just a single best-effort class of service has been part of the IP architecture for more than 20 years. In this section, we examine some of the historical approaches to supporting differentiated service classes in large IP networks.
The First Approach: RFC 791

In September 1981, RFC 791 standardized the Internet Protocol and reserved the second byte of the IP header as the type of service (ToS) field. The bits of the ToS byte were defined as Figure 10 shows.
Figure 10: RFC 791 Bit Definitions of the ToS Byte
The first three bits of the ToS byte (the precedence bits) could be set by a node to select the relative priority, or precedence, of the packet. The next three bits could be set to request normal or low delay (D), normal or high throughput (T), and normal or high reliability (R). The final two bits of the ToS byte were reserved for future use. However, very little architecture was provided to support the delivery of differentiated service classes in IP networks using these capabilities.
Bit:    0  1  2       3   4   5   6  7
Field:  Precedence    D   T   R   Reserved
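Under the RFC 791 layout above (bits numbered from the most significant), the fields can be extracted from a ToS byte with shifts and masks:

```python
def parse_tos(tos_byte):
    """Split an RFC 791 ToS byte into precedence, D, T, and R fields."""
    return {
        "precedence": (tos_byte >> 5) & 0x07,       # bits 0-2
        "low_delay": bool(tos_byte & 0x10),         # bit 3 (D)
        "high_throughput": bool(tos_byte & 0x08),   # bit 4 (T)
        "high_reliability": bool(tos_byte & 0x04),  # bit 5 (R)
    }
```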
Until the mid-1990s, the only application of the IP precedence bits was to support a feature known as selective packet discard (SPD). SPD set the precedence bits on control packets (link-level keepalives, routing protocol keepalives, and routing protocol updates) so that, if the network experienced congestion, critical control traffic would be the last to be discarded. The goal was to enhance network stability during periods of congestion. In practice, the DTR bits were never used.
The Second Approach: The Integrated Services Model (IntServ)
Around 1993, comprehensive work began in the IETF to develop a mechanism that would allow IP to support more than a single best-effort class of service. The goal was to provide real-time service alongside traditional non-real-time service in a shared IP network. This work resulted in the Integrated Services (IntServ) architecture, which is based on per-flow resource reservation.
IntServ Architecture
The IntServ architecture defined a reference model that specifies a number of components and the interplay among them:

- The resource reservation setup protocol (RSVP), which allows individual applications to request resources from routers and install per-flow state along the path of the packet flow.
- Two new service models: guaranteed service and controlled-load service. Guaranteed service provides firm assurances (through strict admission control, bandwidth allocation, and fair queuing) for applications that require guaranteed bandwidth and delay. Controlled-load service does not provide guaranteed bounds on bandwidth or delay, but emulates a lightly loaded, best-effort network.
- Flow specifications that provide a syntax for applications to state their specific resource requirements.
- A packet classification process that examines incoming packets and decides which class of service should be applied to each packet.
- An admission control process that determines whether a requested reservation can be supported, based on the availability of both local and network resources.
- A policing and shaping process that monitors each flow to ensure that it conforms to its traffic profile.
- A packet scheduling process that distributes network resources (buffers and bandwidth) among the different flows.
The IntServ model requires that source and destination hosts exchange RSVP signaling messages to establish packet classification and forwarding state at each node along the path between them. (See Figure 11.)
Figure 11: Resource Reservation Protocol (RSVP)
While the industry learned a tremendous amount during the development of the IntServ architecture, it eventually concluded that IntServ was not a suitable mechanism to support the delivery of differentiated service classes in large IP networks:
- IntServ is not scalable, because it requires significant amounts of per-flow state and packet processing at each node along the end-to-end path. In the absence of state aggregation, the amount of state that must be maintained at each node scales in proportion to the number of simultaneous reservations through that node, and the number of flows on a high-speed backbone link could range from tens of thousands to over a million.
- IntServ requires that applications running on end systems support the RSVP signaling protocol, but very few operating systems offered an RSVP API that application developers could access.
- IntServ requires that all nodes in the network path support the IntServ model, including the ability to map IntServ service classes to link-layer technologies.
Although the IntServ model failed, it led to the development and deployment of RSVP, which we now use as a general-purpose signaling protocol for MPLS traffic engineering, fast LSP restoration, and the rapid provisioning of optical links (GMPLS or MPλS). RSVP performs very well as a signaling protocol for MPLS because, in this application, it does not experience the scalability problems associated with IntServ.
An IntServ Enhancement: Aggregation of RSVP Reservations
As discussed above, one of the major scalability limitations of RSVP is its inability to aggregate individually reserved sessions into a single, shared class. In September 2001, RFC 3175 ("Aggregation of RSVP for IPv4 and IPv6 Reservations") defined procedures that allow a single RSVP reservation to aggregate other RSVP reservations across a large IP network. It proposed mechanisms to dynamically establish the aggregate reservation, identify the specific traffic to which the aggregate reservation applies, determine how much bandwidth is required to satisfy the reservation, and reclaim bandwidth when the subreservations are no longer required.
RFC 3175 enhances the scalability of RSVP for use in large IP networks by:
- Reducing the number of signaling messages exchanged and the amount of reservation state that must be maintained, by making a limited number of large reservations rather than a large number of small, flow-specific reservations;
- Streamlining the packet classification process in core routers by using the Differentiated Services codepoint, or DSCP (see the discussion of DiffServ that follows), to identify an aggregated flow, instead of the traditional RSVP flow classification mechanism; and
- Simplifying packet queuing and scheduling by combining the aggregated streams into the same queue on an output port.
[Figure: hop-by-hop RSVP signaling, with Path and Resv messages exchanged at each RSVP-capable router between the source and destination PCs.]
Among the potential applications for aggregation of RSVP reservations are these three:
- Interconnection of PSTN-call gateways across a provider backbone,
- Aggregation of RSVP paths at the edges of a provider network, and
- Aggregation of RSVP paths across the core of a provider network.
One of the strengths of RSVP is that it supports admission control on a per-flow basis. This can be a powerful tool when supporting premium interactive voice services. Assume that you establish an aggregated RSVP reservation to support 1000 voice calls. As long as there are fewer than 1000 active calls, a new call will be accepted by admission control, which will allocate adequate bandwidth to support subscriber performance requirements. The 1001st call will be denied access by admission control, thus preserving the quality of service delivered to the 1000 established calls.
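The per-flow admission-control arithmetic in this example is simple enough to sketch directly. All names here are hypothetical; this is a minimal illustration of the call-counting behavior, not a real RSVP implementation.

```python
# Per-flow admission control for an aggregated reservation sized for
# 1000 voice calls: the 1001st simultaneous call is refused, which
# protects the quality of the established calls.
CAPACITY_CALLS = 1000
active_calls = set()

def admit(call_id):
    """Accept a new call only while reserved capacity remains."""
    if len(active_calls) >= CAPACITY_CALLS:
        return False               # denied by admission control
    active_calls.add(call_id)
    return True

def release(call_id):
    active_calls.discard(call_id)  # frees capacity for new calls

for i in range(1001):
    admitted = admit(i)
print(admitted)   # the 1001st attempt → False
```

Contrast this with the per-packet behavior of DiffServ described next: there is no counter to consult, so excess calls are admitted and every call's packets share the resulting drops.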
As you will see in the next section, the DiffServ model performs admission control on a per-packet basis, not on a per-flow basis. This means that, at the edge of a DiffServ domain, calls 1001 through 1100 will be accepted but, because the service class is now out-of-profile, packets will be randomly dropped, thereby degrading the quality of service delivered for all of the calls. You can overcome this limitation of DiffServ by combining aggregated RSVP at the edges of the network, to perform per-flow admission control for a voice gateway, with DiffServ in the core of the network, to support application performance requirements across the backbone.
The Third Approach: The Differentiated Services Model (DiffServ)
Around 1995 or 1996, service providers and various academic institutions began to examine alternative approaches to supporting more than a single best-effort class of service, but this time by using mechanisms that could provide the requisite scalability. As discussed in the previous section, the failure of the IntServ model was due to the signaling explosion and the amount of per-flow state that needed to be maintained at each node in the packet-forwarding path. As a result, all of these new proposals sought to prevent these scalability issues. Figure 12 illustrates the cost, relative to complexity, of the new approaches to supporting differentiated service classes.
Figure 12: Cost Relative to Complexity of Differentiated Services Solutions
[Figure: a plot of increasing cost against increasing complexity, positioning best-effort, DiffServ, and IntServ.]

At that time, there were a number of different proposals to redefine the meaning of the three precedence bits in the ToS byte of the IP header. The proposals ranged from using a single bit, similar to the Frame Relay DE bit, to arbitrary bit definitions and even hybrid approaches, where some bits were used for certain functions and the remaining bits were used for other functions. There was a lot of talk, and some vendor code, but never any real production deployment. The lack of successful deployment was because routers were software-based: any attempt to make the packet-forwarding process more complicated affected forwarding performance, so it was simply easier to overprovision congested links.
By 1997, the IETF realized that IntServ was not going to be deployed in production networks, and that the commercial sector had been thinking about supporting differentiated service classes for specific customers or applications in a more coarse-grained and more scalable way by using the IP precedence bits. As a result, the IETF created the DiffServ Working Group, which met for the first time in March 1998. The goal of this group was to create relatively simple and coarse methods of providing differentiated classes of service for Internet traffic, to support various types of applications and specific business models.
The IETF Architecture for Differentiated Services
The DiffServ Working Group has changed the name of the IPv4 ToS octet to the DS byte and defined new meanings for each of the bits. (See Figure 13.) The new specification for the DS Field is applied to both the IPv4 ToS octet and the IPv6 traffic class octet, so that they use a common set of mechanisms to support the delivery of differentiated service classes.
Figure 13: Differentiated Services Field (DS Field)
The IETF's DiffServ Working Group divides the DS byte into two subfields:

- The six high-order bits are known as the Differentiated Services codepoint (DSCP). The DSCP is used by a router to select the per-hop behavior (PHB) that a packet experiences at each hop within a Differentiated Services domain. A PHB is an externally observable forwarding treatment applied to all packets that belong to the same service class or behavior aggregate (BA).
- The two low-order bits are currently unused (CU) and reserved for future use. These two bits are presently set aside for use by the explicit congestion notification (ECN) experiment. The values of the CU bits are ignored by each node when it determines the PHB to apply to a packet.
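Splitting the DS byte into these two subfields takes only a couple of bit operations. The sketch below uses a hypothetical helper, `split_ds_byte`, with the recommended EF codepoint (101110) as sample input.

```python
# Split the DS byte into the six high-order bits (DSCP) and the two
# low-order bits (CU, currently set aside for the ECN experiment).
def split_ds_byte(ds):
    dscp = ds >> 2       # six high-order bits select the PHB
    cu = ds & 0b11       # two low-order bits, ignored for PHB selection
    return dscp, cu

# A DS byte carrying the EF codepoint 101110 with the CU bits zero:
dscp, cu = split_ds_byte(0b10111000)
print(format(dscp, "06b"), cu)   # → 101110 0
```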
The complete DiffServ architecture, defined in RFC 2475, is based on a relatively simple model, whereby traffic that enters a network is first classified and then possibly conditioned at the edges of the network. Depending on the result of the packet classification process, each packet is associated with one of the BAs supported by the Differentiated Services domain. The BA that each packet is assigned to is indicated by the specific value carried in the DSCP bits of the DS Field. When a packet enters the core of the network, each router along the transit path applies the appropriate PHB, based on the DSCP carried in the packet's header. It is this combination of traffic conditioning (policing and shaping) at the edges of the network, packet marking at the edges of the network, local per-class forwarding behaviors in the interior of the network, and adequate network provisioning that allows the DiffServ model to support scalable service discrimination across a common IP infrastructure.
  0   1   2   3   4   5   6   7
 +---------------------------+-------+
 | Differentiated Services   |       |
 | Codepoint (DSCP)          |  CU   |
 +---------------------------+-------+
Differentiated Services Domain (DS Domain)
A Differentiated Services domain (DS domain) is a contiguous set of routers that operate with common sets of service provisioning policies and PHB group definitions. (See Figure 14.) A DS domain is typically managed by a single administrative authority that is responsible for ensuring that adequate network resources are available to support the service level specifications (SLSs) and traffic conditioning specifications (TCSs) offered by the domain.
Figure 14: Differentiated Services Domain (DS Domain)
A DS domain consists of DS boundary nodes and DS interior nodes.
- DS boundary nodes sit at the edges of a DS domain and function as both DS ingress and egress nodes for different directions of traffic flows. When functioning as a DS ingress node, a DS boundary node is responsible for the classification, marking, and possibly conditioning of ingress traffic. It classifies each packet, based on an examination of the packet header, and then writes the DSCP to indicate one of the PHB groups supported within the DS domain. When functioning as a DS egress node, the DS boundary node may be required to perform traffic conditioning functions on traffic forwarded to a directly connected peering domain. DS boundary nodes connect a DS domain to another DS domain or to a non-DS-capable domain.
- DS interior nodes select the forwarding behavior applied to each packet, based on an examination of the packet's DSCP (they honor the PHB indicated in the packet header). DS interior nodes map the DSCP to one of the PHB groups supported by all of the DS interior nodes within the DS domain. DS interior nodes connect only to other DS interior nodes or boundary nodes within the same DS domain.
Differentiated Service Router Functions
Figure 15 provides a logical view of the operation of a packet classifier and traffic conditioner
on a DiffServ-capable router.
Figure 15: Packet Classifier and Traffic Conditioner
Packet Classification
A packet classifier selects packets in a traffic stream based on the content of fields in the packet header. The DiffServ architecture defines two types of packet classifiers:
- A behavior aggregate (BA) classifier selects packets based on the value of the DSCP only.
- A multifield (MF) classifier selects packets based on a combination of the values of one or more header fields. These fields can include the source address, destination address, DS Field, protocol ID, source port, destination port, or other information, such as the incoming interface. The result of the classification is written to the DS Field to simplify the packet classification task for nodes in the interior of the DS domain.
After the packet classifier identifies packets that match specific rules, each matching packet is directed to a logical instance of a traffic conditioner for further processing.
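The contrast between the two classifier types can be sketched as follows. The matching rule (mark UDP traffic to port 5060 as EF) and the flow tuple are purely hypothetical choices made for this example.

```python
# BA vs. MF classification, as defined by the DiffServ architecture.
def ba_classify(dscp):
    """A BA classifier looks only at the DSCP already in the packet."""
    return dscp

def mf_classify(src, dst, proto, sport, dport):
    """An MF classifier combines several header fields; the result is
    written into the DS Field so interior nodes can fall back to the
    cheaper BA classification. The rule below is hypothetical."""
    if proto == "udp" and dport == 5060:
        return 0b101110        # mark as EF
    return 0b000000            # default best-effort

dscp = mf_classify("10.0.0.1", "10.0.0.2", "udp", 40000, 5060)
print(format(dscp, "06b"))     # → 101110
```

An edge router would run `mf_classify` once per packet; every interior router then needs only the one-field `ba_classify` lookup, which is the scalability point made above.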
Traffic Conditioning

A traffic conditioner may consist of various elements that perform traffic metering, marking, shaping, and dropping. A traffic conditioner is not required to support all of these functions.

- A meter measures a traffic stream to determine whether a particular packet from the stream is in-profile or out-of-profile. The meter passes this state information to other traffic conditioning elements so that different conditioning actions can be applied to in-profile and out-of-profile packets.
- A marker writes (or rewrites) the DS Field of a packet header to a specific DSCP, so that the packet is assigned to a particular DS behavior aggregate.
- A shaper delays some or all packets in a traffic stream to bring the stream into conformance with its traffic profile.
- A dropper (policer) discards some or all packets in a traffic stream to bring the stream into conformance with its traffic profile.
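A toy token-bucket meter feeding a marker shows how these conditioning elements fit together. The rate, burst size, and remark-to-AF13 action are illustrative assumptions, not a prescribed DiffServ algorithm.

```python
class TokenBucketMeter:
    """Minimal token-bucket meter: a packet is in-profile if enough
    tokens (bytes) have accumulated since the last packet."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0    # refill rate in bytes/second
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def measure(self, now, pkt_bytes):
        """Return True if the packet is in-profile, False if out."""
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if pkt_bytes <= self.tokens:
            self.tokens -= pkt_bytes
            return True
        return False

def condition(meter, now, pkt_bytes):
    # Marker stage: in-profile packets keep low drop precedence (AF11);
    # out-of-profile packets are remarked to high drop precedence (AF13).
    return 0b001010 if meter.measure(now, pkt_bytes) else 0b001110

m = TokenBucketMeter(rate_bps=8000, burst_bytes=1500)
print(format(condition(m, 0.0, 1500), "06b"))  # burst fits  → 001010
print(format(condition(m, 0.1, 1500), "06b"))  # bucket empty → 001110
```

A dropper is the same structure with the out-of-profile action changed from "remark" to "discard", and a shaper with the action changed to "queue until tokens accumulate".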
[Figure: packets enter the packet classifier and are directed to a traffic conditioner, where a meter drives a marker and a shaper/dropper.]
Differentiated Services Router Functions
Figure 16 illustrates the functions that are typically performed by DS boundary routers and DSinterior routers.
Figure 16: DiffServ Router Functions
The DS ingress boundary router generally performs MF packet classification and traffic conditioning functions on incoming microflows. A microflow is a single instance of an application-to-application flow that is ultimately assigned to a behavior aggregate. A DS ingress boundary router can also apply the appropriate PHB, based on the result of this packet classification process.
NOTE: A DS ingress boundary router may also perform BA packet classification if it trusts an upstream DS domain's packet classification.
A DS interior router usually performs BA packet classification to associate each packet with a behavior aggregate. It then applies the appropriate PHB by using specific buffer-management and packet-scheduling mechanisms to support the specific packet-forwarding treatment.
Although the DiffServ architecture assumes that the majority of complex packet classification and conditioning occurs at DS boundary routers, the use of MF classification is also supported in the interior of the network.
The DS egress boundary router normally performs traffic shaping as packets leave the DS domain for another DS domain or non-DS-capable domain. A DS egress boundary router may also perform MF or BA packet classification and precedence rewriting if it has an agreement with a downstream DS domain.
[Figure: microflows enter the DS ingress boundary router (MF classification, traffic metering, packet marking, traffic shaping/dropping), cross the DS interior routers as BA traffic (BA classification, PHB support), and leave through the DS egress boundary router (traffic metering, packet marking, traffic shaping/dropping).]
Per-hop Behaviors (PHBs)
A per-hop behavior (PHB) is a description of the externally observable forwarding behavior applied to a particular behavior aggregate. The PHB is the means by which a DS node allocates its resources to different behavior aggregates. The DiffServ architecture supports the delivery of scalable service discrimination, based on this hop-by-hop resource allocation mechanism.
PHBs are defined in terms of the behavior characteristics that are relevant to a provider's service provisioning policies. A specific PHB may be defined in terms of:
- The amount of resources allocated to the PHB (buffer size and link bandwidth),
- The relative priority of the PHB compared with other PHBs, or
- The observable traffic characteristics (delay, jitter, and loss).
However, PHBs are not defined in terms of specific implementation mechanisms. Consequently, a variety of different implementation mechanisms may be acceptable for implementing a specific PHB group.
The IETF DiffServ Working Group has defined two PHBs:
- Expedited forwarding PHB
- Assured forwarding PHB
In the future, new DSCPs can be assigned by a provider for its own local use or by new standards activity.
Expedited Forwarding (EF PHB)
According to the IETF's DiffServ Working Group, the Expedited Forwarding (EF) PHB is designed to provide "low loss, low delay, low jitter, assured bandwidth, end-to-end service." In effect, the EF PHB simulates a virtual leased line to support highly reliable voice or video and to emulate dedicated circuit services. The recommended DSCP for the EF PHB is 101110.
Since the only aspect of delay that you can control in your network is the queuing delay, you can minimize both delay and jitter by minimizing queuing delays. Thus, the intent of the EF PHB is to arrange that suitably marked packets encounter extremely short or empty queues, ensuring minimal delay and jitter. You can achieve this only if the service rate for EF packets on a given output port exceeds the usual rate of packet arrival at that port, independent of the load on other (non-EF) PHBs.
The EF PHB can be supported on DS-capable routers in several ways:

- By policing EF microflows to prescribed values at the edge of the DS domain (this is required to ensure that the service rate for EF packets exceeds their arrival rate in the core of the network),
- By ensuring adequate provisioning of bandwidth across the core of your network,
- By placing EF packets in the highest strict-priority queue and ensuring that the minimum output rate is at least equal to the maximum input rate, or
- By rate-limiting the EF aggregate load in the core of your network to prevent inadequate bandwidth for other service classes.
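The strict-priority option above can be sketched as a scheduler that always drains the EF queue first; the class names and packet labels are illustrative assumptions.

```python
from collections import deque

# Strict-priority scheduling: EF packets are always dequeued before any
# other class, so they see an effectively empty queue as long as their
# arrival rate stays below the port's service rate.
queues = {"ef": deque(), "af": deque(), "be": deque()}  # priority order

def dequeue():
    for cls in ("ef", "af", "be"):   # strict priority: EF first, always
        if queues[cls]:
            return cls, queues[cls].popleft()
    return None                      # all queues empty

queues["be"].append("p1")
queues["ef"].append("p2")
queues["af"].append("p3")
print(dequeue())   # → ('ef', 'p2')
```

This is also why the edge policing and core rate-limiting bullets matter: without them, an unbounded EF aggregate in a strict-priority queue could starve every other class.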
Generally, you will not use RED as a queue memory-management mechanism when
supporting the EF PHB, because the majority of the traffic is UDP-based, and UDP does notrespond to packet drops by reducing its transmission rate.
Assured Forwarding (AF PHB)
The Assured Forwarding (AF) PHB is a group of PHBs designed to ensure that packets are forwarded with a high probability of delivery, as long as the aggregate traffic in a forwarding class does not exceed the subscribed information rate. If ingress traffic exceeds its subscribed information rate, then out-of-profile traffic is not delivered with as high a probability as traffic that is in-profile.

The AF PHB group includes four traffic classes. Packets within each AF class can be marked with one of three possible drop-precedence values. The AF PHB group can be used to implement an Olympic-style service that consists of three service classes: gold, silver, and bronze. If you wish, you can further differentiate packets within each class by giving them either low, medium, or high drop precedence within the service class. Table 2 summarizes the recommended DSCPs for the four AF PHB groups.
Table 2: Recommended AF DiffServ Codepoint (DSCP) Values
The AF PHB groups have not been assigned specific service definitions by the DiffServ Working Group. The groups can be viewed as the mechanism that allows a provider to offer differentiated levels of forwarding assurances for IP packets. It is the responsibility of each DS domain to set the quantitative and qualitative differences between AF classes.
In a DS-capable router, the level of forwarding assurance for any given packet depends on:
- The amount of bandwidth and buffer space allocated to the packet's AF class,
- The amount of congestion for the AF class within the router, and
- The drop precedence of the packet.
The AF PHB group can be supported on DS-capable routers by:
- Policing AF microflows to prescribed values at the edge of the DS domain,
- Ensuring adequate provisioning of bandwidth across the core of your network,
- Placing each AF service class into a separate queue,
- Selecting the appropriate queue scheduling discipline to allocate buffer space and bandwidth to each AF service class, and
- Configuring RED to honor the three low-order bits in the DSCP to determine how aggressively a packet is dropped during periods of congestion.
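The recommended AF codepoints in Table 2 follow a regular pattern: three class bits, two drop-precedence bits, and a trailing zero. A small sketch (the function name is a hypothetical helper) makes the encoding explicit.

```python
# Encode the recommended AF DSCP from its class and drop precedence:
# bits 0-2 carry the AF class, bits 3-4 the drop precedence, bit 5 is 0.
def af_dscp(af_class, drop_prec):
    """Return the recommended DSCP for AF class 1-4, drop precedence
    1 (low) to 3 (high)."""
    return (af_class << 3) | (drop_prec << 1)

print(format(af_dscp(1, 1), "06b"))   # AF class 1, low drop  → 001010
print(format(af_dscp(4, 3), "06b"))   # AF class 4, high drop → 100110
```

Because the drop precedence sits in the low-order bits of the DSCP, a router can honor it in RED without needing the full codepoint table.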
                          AF Class 1   AF Class 2   AF Class 3   AF Class 4
Low drop precedence       001010       010010       011010       100010
Medium drop precedence    001100       010100       011100       100100
High drop precedence      001110       010110       011110       100110

Default PHB

RFC 1812 specifies the default PHB as the conventional best-effort forwarding behavior. When no other agreements are in place, all packets are assumed to belong to this traffic aggregate. A packet assigned to this aggregate may be sent into a network without following any specific rules, and the network will deliver as many of these packets as possible, as soon as possible, subject to other resource-policy constraints. The recommended DSCP for the default PHB is 000000.
General Observations about Differentiated Services
In this section, we discuss general observations about the nature of the DiffServ architecture tohelp you understand what you can or should expect if you decide to deploy it. It is important
to maintain a healthy skepticism about DiffServ, because it does not provide a magic solutionthat has the ability to solve all of the congestion-related problems in your network.
DiffServ Does Not Create Free Bandwidth
Routers are statistical multiplexing devices; therefore, they can experience congestion when
the amount of traffic that needs to traverse a port exceeds the output port's capacity. Thisme