ieee/acm transactions on networking, vol.7, overhead byte stuffing ... but bit-oriented algorithms...
Post on 06-May-2018
Embed Size (px)
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL.7, NO. 2, APRIL 1999 Page 1 of 15
Consistent Overhead Byte StuffingStuart Cheshire and Mary Baker, Member, IEEE
AbstractByte stuffing is a process that encodes a sequence ofdata bytes that may contain illegal or reserved values, using apotentially longer sequence that contains no occurrences of thesevalues. The extra length is referred to here as the overhead of theencoding. To date, byte stuffing algorithms, such as those used bySLIP, PPP and AX.25, have been designed to incur low averageoverhead, but little effort has been made to minimize their worst-case overhead. However, there are some increasingly popularnetwork devices whose performance is determined more by theworst case than by the average case. For example, the transmis-sion time for ISM-band packet radio transmitters is strictly lim-ited by FCC regulation. To adhere to this regulation, the currentpractice is to set the maximum packet size artificially low so thatno packet, even after worst-case overhead, can exceed the trans-mission time limit.
This paper presents a new byte stuffing algorithm, called Con-sistent Overhead Byte Stuffing (COBS), which tightly bounds theworst-case overhead. It guarantees in the worst case to add nomore than one byte in 254 to any packet. For large packets thismeans that their encoded size is no more than 100.4% of theirpre-encoding size. This is much better than the 200% worst-casebound that is common for many byte stuffing algorithms, and isclose to the information-theoretic limit of about 100.07%. Fur-thermore, the COBS algorithm is computationally cheap, and itsaverage overhead is very competitive with that of existing algo-rithms.
Index TermsPacket, Serial, Transmission, Framing, Bytestuffing
HE PURPOSE of byte stuffing is to convert data packetsinto a form suitable for transmission over a serial medium
like a telephone line. When packets are sent over a serial me-dium there needs to be some way to tell where one packetends and the next begins, particularly after errors, and this istypically done by using a special reserved value to indicatepacket boundaries. Byte stuffing ensures, at the cost of a po-tential increase in packet size, that this reserved value does notinadvertently appear in the body of any transmitted packet. Ingeneral, some overhead (additional bytes transmitted over theserial medium) is inevitable if we are to perform byte stuffingwithout loss of information.
Current byte stuffing algorithms, such as those used by Se-rial Line IP (SLIP) [RFC1055], the Point-to-Point Protocol(PPP) [RFC1662] and AX.25 (Amateur Packet Radio)[ARRL84], have highly variable per-packet overhead. For ex-
ample, using conventional PPP byte stuffing, the total over-head (averaged over a large number of packets) is typically1% or less, but this overhead is not consistent for all packets.Some packets incur no overhead at all, while others incurmuch more; in principle some could increase in size by asmuch as 100%. High-level Data Link Control (HDLC) uses abit stuffing technique [ECMA-40] that has a worst-case ex-pansion of only 20%, but bit-oriented algorithms are oftenharder to implement efficiently in software than byte-orientedalgorithms.
While large variability in overhead may add jitter and un-predictability to network behavior, this problem is not fatal forPPP running over a telephone modem. An IP host [RFC791]using standard PPP encapsulation [RFC1662] need only makeits transmit and receive buffers twice as large as the largest IPpacket it expects to send or receive. The modem itself is aconnection-oriented device and is only concerned with trans-mitting an unstructured stream of byte values. It is unaware ofthe concept of packet boundaries, so unpredictability of packetsize does not directly affect modem design.
In contrast, new devices are now becoming available, par-ticularly portable wireless devices, that are packet-oriented,not circuit-oriented. Unlike telephone modems these devicesare aware of packets as distinct entities and consequently im-pose some finite limit on the maximum packet size they cansupport. Channel-hopping packet radios that operate in theunlicensed ISM (Industrial/Scientific/Medical) bands underthe FCC Part 15 rules [US94-15] are constrained by a maxi-mum transmission time that they may not exceed. If the over-head of byte stuffing unexpectedly doubles the size of apacket, it could result in a packet that is too large to transmitlegally. Using conventional PPP encoding [RFC1662], theonly way to be certain that no packets will exceed the legallimit is to set the IP maximum transmission unit (MTU) to halfthe devices true MTU, despite the fact that it is exceedinglyrare to encounter a packet that doubles in size when encoded.
Halving the MTU can significantly degrade the performanceseen by the end-user. For example, assuming one 40-byte TCPacknowledgement for every two data packets [RFC1122], aMetricom packet radio with an MTU of 1000 bytes achieves amaximum end-user throughput of 5424 bytes per second[Che96]. Halving the MTU to 500 bytes reduces the maximumend-user throughput to 3622 bytes per second, a reduction ofroughly 33%, which is a significant performance penalty.
Although, as seen from experiments presented in this paper,packets that actually double in size rarely occur naturally, it isnot acceptable to ignore the possibility of their occurrence.Without a factor-of-two safety margin, the network devicewould be open to malicious attack through artificially con-structed pathological packets. An attacker could use suchpackets to exploit the devices inability to send and/or receive
Manuscript received July 29, 1998; approved by IEEE/ACM TRANSACTIONSON NETWORKING Editor J. Crowcroft. This work was supported in partby Rockwell International Science Center, FX Palo Alto Laboratory,AT&T/Lucent, under NSF Faculty Career Grant CCR-9501799, and by agrant from the Keio Research Institute at SFC, Keio University & theInformation-Technology Promotion Agency of Japan, and by an Alfred P.Sloan Faculty Research Fellowship. This paper was presented in part atSIGCOMM 97, Cannes, France, September 1997.
Stuart Cheshire is with Apple Computer, Cupertino, CA 95014 USA.Mary Baker is with the Computer Science Department, Stanford University,
Stanford, CA 94305 USA.Publisher Item Identifier S 1063-6692(99)03826-1
10636692/99$10.00 1999 IEEE
Page 2 of 15 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL.7, NO. 2, APRIL 1999
worst-case packets, causing mischief ranging from simple de-nial of service attacks to the much more serious potential forexploiting array-bounds errors such as the one in the Unixfinger dmon exploited by the infamous Internet Worm of1989 [RFC1135].
One way to avoid the performance degradation of halvingthe MTU might be to set the IP MTU closer to the underlyingdevice MTU, and then use some kind of packet fragmentationand reassembly to handle those packets that, when encoded,become too large to send. Packet fragmentation is not withoutcost though, as Chris Bennett reported in [Ben82] where hewrote, As well as the global overheads such as endpoint de-lay, fragmentation generates overheads for other componentsof the system notably the intermediate gateways.
One kind of packet fragmentation is link-layer fragmenta-tion (transparent fragmentation). While this could work, re-quiring link-layer software to perform fragmentation and reas-sembly is a substantial burden to impose on device driverwriters. In [Kent87], Kent and Mogul wrote transparentfragmentation has many drawbacks gateway implementa-tions become more complex and require much more buffermemory Fragmentation and reassembly also providesmore opportunities for software bugs [CA-96.26] and addsprotocol overhead, because the link-layer packet headers haveto contain additional fields to support the detection and reas-sembly of fragments at the receiving end.
A second way to use fragmentation would be to avoid add-ing a new layer of fragmentation and reassembly at the devicelevel, and instead to use IPs own existing fragmentation andreassembly mechanisms, but this also has problems. Oneproblem is that in current networking software implementa-tions IP fragmentation occurs before the packet is handed tothe device driver software for transmission. If, after encodingby the device driver, a particular packet becomes too large tosend, there is no mechanism for the driver to hand the packetback to IP with a message saying, Sorry, I failed, can youplease refragment this packet into smaller pieces and tryagain? Also, some software goes to great lengths to select anoptimal packet size. It would be difficult for such software tocope with an IP implementation where the MTU of a devicevaries for every packet. Finally, it is unclear, if we were tocreate such a mechanism, how it would handle the IP headersDont Fragment bit without risking the creation of path MTUdiscovery [RFC1191] black holes.
All these problems could be avoided if we used a bytestuffing algorithm that did not have such inconsistent behav-ior. This paper presents a new algorithm, called ConsistentOverhead Byte Stuffing (COBS), which can be relied upon toencode all packets efficiently, regardless of their contents. It iscomputationally cheap, easy to implement in software, and hasa worst-case performance bound that is better even thanHDLCs bit stuffing scheme. All packets up to 254 bytes inlength are encoded with an overhead of exactly one byte. Forpackets over 254 bytes in length the overhead is at most onebyte for every 254 bytes of packet data. The maximum over-head is therefore roughly 0.4% of the packet size, rounded upto a whole number of bytes.
Using Consistent Overhead Byte Stuffing, the IP MTU maybe set as high as 99.6% of the underlying devices maxi