explaining structured errors in gigabit...

Seediscussions,stats,andauthorprofilesforthispublicationat:https://www.researchgate.net/publication/228773987

ExplainingStructuredErrorsinGigabitEthernet

ARTICLE·APRIL2005

READS

13

4AUTHORS,INCLUDING:

AndrewW.Moore

UniversityofCambridge

105PUBLICATIONS2,202CITATIONS

SEEPROFILE

Allin-textreferencesunderlinedinbluearelinkedtopublicationsonResearchGate,

lettingyouaccessandreadthemimmediately.

Availablefrom:AndrewW.Moore

Retrievedon:22January2016

https://www.researchgate.net/publication/228773987_Explaining_Structured_Errors_in_Gigabit_Ethernet?enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ%3D%3D&el=1_x_2

https://www.researchgate.net/publication/228773987_Explaining_Structured_Errors_in_Gigabit_Ethernet?enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ%3D%3D&el=1_x_3

https://www.researchgate.net/?enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ%3D%3D&el=1_x_1

https://www.researchgate.net/profile/Andrew_Moore8?enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ%3D%3D&el=1_x_4


https://www.researchgate.net/institution/University_of_Cambridge?enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ%3D%3D&el=1_x_6


Explaining Structured Errors in Gigabit Ethernet Andrew W. Moore, Laura B. James, Richard Plumb and Madeleine Glick

IRC-TR-05-032

March 2005

1

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. Copyright © Intel Corporation 2003 * Other names and brands may be claimed as the property of others.

Explaining Structured Errors in Gigabit EthernetAndrew W. Moore

�, Laura B. James

�, Richard Plumb

�and Madeleine Glick

��University of Cambridge, Computer Laboratory

[email protected]�University of Cambridge, Department of Engineering, Centre for Photonic Systems�

lbj20,rgp1000 � @eng.cam.ac.uk�Intel Research, [email protected]

Abstract— A physical layer coding scheme is designed to

make optimal use of the available physical link, providing

functionality to higher components in the network stack.

This paper presents results of an exploration of the errors

observed when an optical Gigabit Ethernet link is subject

to attenuation. In the unexpected results, highlighted by

some data payloads suffering from far higher probability

of error than others, an analysis of the coding scheme

combined with an analysis of the physical light path leads

us to conclude that the cause is an interaction between the

physical layer and the 8B/10B block coding scheme. We

consider the physics of a common type of laser used in

optical communications, and how the frequency charac-

teristics of data may affect performance. A conjecture is

made that there is a need to build converged systems, with

the combinations of physical, data-link, and network layers

optimised to interact correctly. In the mean time, what will

become increasingly necessary is both an identification of

the potential for failure and the need to plan around it.

Topic Keywords: Optical Networks, Network

Architectures, System design.

I. INTRODUCTION

Many modern networks are constructed as a series of

layers. The use of layered design allows for the modular

construction of protocols, each providing a different

service, with all the inherent advantages of a module-

based design. Network design decisions are often based

on assumptions about the nature of the underlying layers.

For example, design of an error-detecting algorithm, such

as a packet checksum, will be based upon a number of

premises about the nature of the data over which it is to

work and assumptions about the fundamental properties

of the underlying communications channel over which it

is to provide protection.

Yet the nature of the modular, layered design of

network stacks has caused this approach to work against

the architects, implementers and users. There exists a

tension between the desire to place functionality in the

most appropriate sub-system, ideally optimised for each

incarnation of the system, and the practicalities of mod-

ular design intended to allow independent developers to

construct components that will inter-operate with each

other through well-defined interfaces. Past experience

has lead to assumptions being made in the construction

or operation of one layer’s design that can lead to

incorrect behaviour when combined with another layer.

There are numerous examples describing the problems

caused when layers do not behave as the architects

of certain system parts expected. One might be the

previous vendor recommendation (and common practice)

for disabling UDP checksums, because the underlying

Ethernet FRC was considered sufficiently strong, led

to data corruptions for UDP-based NFS and network

name (DNS) services [1]. Another example is the re-

use of the 7-bit scrambler for data payloads [2], [3].

The 7-bit scrambling of certain data payloads (inputs)

results in data that is (mis)identified by the SONET [4]

system as belonging to the control channel rather than the

information channel. Also, Tennenhouse [5] noted how

layered multiplexing had a significant negative impact

upon an application’s jitter.

It is our conjecture that this naıve layering, while often

considered a laudable property in computer communica-

tions networks, can lead to irreconcilable faults due to

differences in such fundamental measures as the number

and nature of errors in a channel, a misunderstanding of

the underlying properties or needs of an overlaid layer.

We concentrate upon the naıve layering present when

assumptions are made about the properties of protocol

layers below or above the one being designed. This is

further exacerbated by layers evolving as new technolo-

gies drive speed, availability, etc. Often such evolution is

not explicitly taken into account by existing layers (such

as networks, transport-systems or applications). In fact,

quite the opposite situation occurs where existing layers

are assumed to operate normally.

Outline

Section II describes our motivations for this work

including new optical networking technologies which

change previous assumptions about the physical layer

and the implications of the limits on the quantity of

power useable in optical networks.

We present a study of the 8B/10B block-coding system

of Gigabit Ethernet [6], the interaction between an (N,K)

block code, an optical physical layer, and data trans-

ported using that block code in Section III. In Section IV

we document our findings on the reasons behind the

interactions observed.

Further to our experiments with Gigabit Ethernet,

in Section V we illustrate how the issues we identify

have ramifications for systems with increasing physical

complexity. Alongside conclusions, Section VI states a

number of the directions that further work may take.

II. MOTIVATIONS

A. Optical Networks

Current work in all areas of networking has led

to increasingly complex architectures: our interest is

focused upon the field of optical networking, but this

is also true in the wireless domain. Our exploration

of the robustness of network systems is motivated by

https://www.researchgate.net/publication/2415972_When_the_CRC_and_TCP_checksum_disagree?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

the increased demands of these new optical systems.

Wavelength Division Multiplexing (WDM), capable for

example of 160 wavelengths each operating at 10 Gbps

per channel, able to operate at over 5,000 km without

regeneration [7], is a core technology in the current com-

munications network. However the timescale of change

for such long haul networks is very long: the same

scale as deployment, involving from months to years of

planning, installation and commissioning. To take advan-

tage of capacity developments at the shorter timescales

relevant to the local area network, as well as system

and storage area networks, packet switching and burst

switching techniques have seen significant investigation

(e.g. [8], [9]). Optical systems such as [10] and [11] have

lead us to recognise that the need for higher data-rates

and designs with larger numbers of optical components

are forcing us towards what traditionally have been

technical limits.

Further to the optical-switching domain, there have

been changes in the construction and needs of fibre-

based computer networks. In deployments containing

longer runs of fibre using large numbers of splitters for

measurement and monitoring and active optical devices,

the overall system loss may be greater than in today’s

point-to-point links and the receivers may have to cope

with much-lower optical powers. Increased fibre lengths

used to deliver Ethernet services, e.g., Ethernet in the

first mile [12], along with a new generation of switched

optical networks, are examples of this trend.

Additionally, we are increasingly impacted by operator

practice. For example, researchers have observed that

up to 60% of faults in an ISP-grade network are due

to optical events [13]: defined as ones where it was

assumed errors results directly from operational faults

of in-service equipment. While the majority of these will

be catastrophic events (e.g., cable breaks), a discussion

with the authors allows us to speculate that a non-trivial

percentage of these events will be due to the issues of

layer-interaction discussed in this paper.

B. The Power Problem

If all other variables are held constant an increase

in data rate will require a proportional increase in

transmitter power. A certain number of photons per bit

must be received to guarantee any given bit error rate

(BER), even if no thermal noise is present. The arrival

of photons at the receiver is a Poisson process. If 20

photons must arrive per bit to ensure a BER of ��for a given receiver, doubling the bit rate means that

the time in which the 20 photons must arrive is halved.

The number of photons sent per unit time must thus be

doubled (doubling the transmission power) to maintain

the BER.

The ramifications of this are that a future system

operating at higher rates will either require twice the

power (a 3dB increase), be able to operate with 3dB

less power for the given information rate (equivalent

to the channel being noisier by that proportion) or a

compromise between these. Fibre nonlinearities impose

https://www.researchgate.net/publication/220200908_Terabit_Burst_Switching?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

https://www.researchgate.net/publication/3233855_Transparent_optical_packet_switching_Network_architecture_and_demonstrators_in_the_KEOPS_project?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

https://www.researchgate.net/publication/261091934_5745_km_DWDM_transcontinental_field_trial_using_10_Gbits_dispersion_managed_solitons_and_dynamic_gain_equalization?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

https://www.researchgate.net/publication/237132660_Wavelength_Striped_Semi-synchronous_Optical_Local_Area_Networks?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

https://www.researchgate.net/publication/237128085_Optical_Local_Area_Network?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

limitations on the maximum optical power able to be

used in an optical network. Subsequently, we maintain

that a greater understanding of the low-power behaviour

of coding schemes will provide invaluable insight for

future systems.

For practical reasons including availability of equip-

ment, its wide deployment, a tractability of the problem-

space and and well documented behaviour we concen-

trate upon the 8B/10B codec.

C. 8B/10B Block Coding

The 8B/10B codec, originally described by Widmer &

Franaszek [14], is widely used. This scheme converts

8 bits of data for transmission (ideal for any octet-

orientated system) into a 10 bit line code. Although

this adds a 25% overhead, 8B/10B has many valuable

properties; a transition density of at least 3 per 10 bit

code group and a maximum run length of 5 bits for clock

recovery, along with virtually no DC spectral component.

These characteristics also reduce the possible signal

damage due to jitter, which is particularly critical in

optical systems, and can also reduce multimodal noise

in multimode fibre connections.

This coding scheme is royalty-free, well understood,

and sees current use in a wide range of applications.

In addition to being the standard Physical Coding Sub-

layer (PCS) for Gigabit Ethernet [6], it is used in the

Fibre Channel system [15]. This codec is also used for

the 800Mbps extensions to the IEEE 1394 / Firewire

standard [16], and 8B/10B will be the basis of coding

for the electrical signals of the upcoming PCI Express

standard [17].

The 8B/10B codec defines encodings for data octets

and control codes which are used to delimit the data

sections and maintain the link. Individual codes or

combinations of codes are defined for Start of Packet,

End of Packet, line Configuration, and so on. Also, Idle

codes are transmitted when there is no data to be sent

to keep the transceiver optics and electronics active. The

Physical Coding Sublayer (PCS) of the Gigabit Ethernet

specification [6] defines how these various codes are

used.

Individual ten-bit code-groups are constructed from

the groups generated by 5B/6B and 3B/4B coding on the

first five and last three bits of a data octet respectively.

During this process the bits are re-ordered, such that

the last bits of the octet for transmission are encoded at

the start of the 10-bit group. This is because the last

5 bits of the octet are encoded first, into the first 6

bits of code, and then the first 3 bits of the octet are

encoded to the final 4 transmitted bits. Some examples

are given in Table I; the running disparity is the sign

of the running sum of the code bits, where a one is

counted as 1 and a zero as -1. During an Idle sequence

between packet transmissions, the running disparity is

changed (if necessary) to -1 and then maintained at

that value. Both control and data codes may change

the running disparity or may preserve its existing value;

examples of both types are shown in Table I. The code-

https://www.researchgate.net/publication/247813277_The_Complete_PCIExpress_Reference?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

https://www.researchgate.net/publication/224103980_A_DC-Balanced_Partitioned-Block_8B10B_Transmission_Code?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

TABLE I

EXAMPLES OF 8B/10B CONTROL AND DATA CODES

Type Octet Octet bits Current RD - Current RD + Note

data 0x00 000 00000 100111 0100 011000 1011 preserves RD value

data 0xf2 111 10010 010011 0111 010011 0001 swaps RD value

control K27.7 111 11011 110110 1000 001001 0111 preserves RD value

control K28.5 101 11100 001111 1010 110000 0101 swaps RD value

group used for the transmission of an octet depends

upon the existing running disparity – hence the two

alternative codes given in the table. A received code-

group is compared against the set of valid code-groups

for the current-receiver running disparity, and decoded

to the corresponding octet if it is found. If the received

code is not found in that set, the specification states

that the group is deemed invalid. In either case, the

received code-group is used to calculate a new value for

the running disparity. A code-group received containing

errors may thus be decoded and considered valid. It

is also possible for an earlier error to throw off the

running disparity calculation causing a later code-group

may be deemed invalid because the running disparity

at the receiver is no longer correct. This can amplify

the effect of a single bit error at the physical layer.

Line coding schemes, although they handle many of the

physical layer constraints, can introduce problems. In the

case of 8B/10B coding, a single bit error on the line can

lead to multiple bit errors in the received data byte. For

example, with one bit error the code-group D0.1 (current

running disparity negative) becomes the code-group D9.1

(also negative disparity); these decode to give bytes with

4 bits of difference. In addition, the running disparity

after the code-group may be miscalculated, potentially

leading to future errors. There are other similar examples

in [6].

III. BIT ERROR RATE VERSUS PACKET ERROR

As illustration of the discontinuity of measurements

made by engineers of the physical layer and those

working in the network layers, we compare bit error rate

(BER) measurements versus measurements of packet-

loss. This Section contains an extended form of the

results first presented in James et al. [18].

A. Experimental Method

We investigate Gigabit Ethernet on optical fibre,

(1000BASE-X [6]) under conditions where the received

power is sufficiently low as to induce errors in the

Ethernet frames. We assume that while the Functional

Redundancy Check (FRC) mechanism within Ethernet is

sufficiently strong to catch the errors, the dropped frames

and resulting packet loss will result in a significantly

higher probability of packet errors than the norm for

https://www.researchgate.net/publication/2921990_Structured_Errors_in_Optical_Gigabit_Ethernet?el=1_x_8&enrichId=rgreq-0c4ccfad-ff6e-4727-baaf-48096b16563e&enrichSource=Y292ZXJQYWdlOzIyODc3Mzk4NztBUzoxMDI5MDM0OTg5MzYzMjhAMTQwMTU0NTUxNDgwOQ==

Fig. 1. Main test environment

certain hosts, applications and perhaps users.

In our main test environment an optical attenuator is

placed in one direction of a Gigabit Ethernet link. A

traffic generator feeds a Fast Ethernet link to an Ethernet

switch, and a Gigabit Ethernet link is connected between

this switch and a traffic sink and tester (Figure 1). The

variable optical attenuator and an optical isolator are

placed in the fibre in the direction from the switch to

the sink. We had previously noted interference due to

reflection and took the precaution to remove this aspect

from the results.

A packet capture and measurement system is im-

plemented within the traffic sink using an enhanced

driver for the SysKonnect SK-9844 network interface

card (NIC). Among a number of additional features,

the modified driver allows application processes to re-

ceive error-containing frames that would normally be

discarded. As well as purpose-built code for the receiving

system we use a special-purpose traffic generator and

comparator. Pre-constructed test data in tcpdump-format

is transmitted from one or more traffic generators using

an adapted version of tcpfire [19]. Transmitted frames are

compared to their received versions and if they differ,

both original and errored frames are stored for later

analysis.

1) Octet Analysis: Each octet for transmission is

coded by the Physical Coding Sublayer of Gigabit

Ethernet using 8B/10B into a 10 bit code-group or

symbol, and we analyze these for frames which are

received in error at the octet level. By comparing the

two possible transmitted symbols for each octet in the

original frame to the two possible symbols corresponding

to the received octet we can deduce the bit errors which

occurred in the symbol at the physical layer. In order

to infer which symbol was sent and which received,

we assume that the combination giving the minimum

number of bit errors on the line is most likely to have

occurred. This allows us to determine the line errors

which most probably occurred.

Various types of symbol damage may be observed.

One of these is the single-bit error caused by the low

signal to noise ratio at the receiver. A second form of

error results from a loss of bit clock causing smeared

bits: where a subsequent bit is read as having the value

of the previous bit. A final example results from the

loss of symbol clock synchronization. This can lead

to the symbol boundaries being misplaced, so that a

sequence of several symbols, and thus several octets, will

be incorrectly recovered.

2) Real Traffic: Some results presented here are

conducted with real network traffic referred to as the

day-trace. This network traffic was captured from the

interconnect between a large research institution and

the Internet over the course of two working days. We

consider it to contain a representative sample of network

traffic for an academic/research organisation of approx-

imately 150 users.

Other traffic tested included pseudo-random data,

consisting of a sequence of frames of the same number

and size as the day-trace data, although each is filled

with a stream of octets whose values were drawn from a

pseudo-random number generator. Structured test data

consists of a single frame containing repeated octets:

0x00–0xff, to make a frame 1500 octets long. The low

error testframe consists of 1500 octets of 0xCC data

(selected for a low symbol error rate); the high error

testframe is 1500 octets of 0x34 data (which displays a

high symbol error rate).

3) Bit Error Rate Measurements: Optical systems

experiments commonly use a Bit Error Rate Test kit

(BERT) to assess the behaviour of the system [20].

This comprises both a data source and a receiving

unit and compares the incoming signal with the known

transmitted one. Individual bit errors are counted both

during short sampling periods and over a fixed period

(defined in time, bits, or errors). The output data can be

used to modulate a laser and a photodiode can be placed

before the BERT receiver to set up an optical link; optical

system components can then be placed between the two

parts of the BERT for testing. Usually, a pseudo-random

bit sequence is used but any defined bit sequence can be

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 256 512 768 1024 1280 1536

Offset of octet within frame

Structured test data Pseudo-random data

Fig. 2. Cumulative distribution of errors versus position in frame

transmitted repeatedly and the error count examined.

For the BER measurements presented here, a directly

modulated 1548nm laser was used. The optical signal

was then subjected to variable attenuation before return-

ing via an Agilent Lightwave (11982A) receiver unit

into the BERT (Agilent parts 70841B and 70842B). The

BERT was programmed with a series of bit sequences,

each corresponding to a frame of Gigabit Ethernet data

encoded as it would be for the line in 8B/10B. Purpose-

built code is used to convert a frame of known data

into the bit-sequence suitable for the BERT. The bit

error rates for these packet bit sequences were measured

at a range of attenuation values, using identical BERT

settings for all frames (e.g. 0/1 thresholding value).

B. Initial Results

We illustrate how errors are position independent but

dependent upon the encoded data. Figure 2 contrasts the

position of errors in the pseudo-random and structured

frames. The day-trace results are not shown because

this data, consisting of packets of varying length, is not

directly comparable. The positions of the symbols most

0.4

0.5

0.6

0.7

0.8

0.9

1

5 10 15 20 25 30 35 40 45 50Symbols in error per Frame

Structured test dataHigh error testframe

Low error testframe

Fig. 3. Cumulative distribution of symbol errors per frame

subject to error can be observed. For the random data the

position distribution is uniform — illustrating that there

is no frame-position dependence in the errors — but the

structured data clearly shows that certain payload-values

display significantly higher error-rates.

Figure 3 indicates the number of incorrectly received

symbols per frame. The low error testframe generated

no symbols in error and thus no frames in error. The

high error testframe results show an increase in errored

symbols per frame. It is important to note that the

errors occur uniformly across the whole packet and that

there are no correlations evident between the positions

of errors within the frame. We interpret this result as

confirming that errors are highly localised within a frame

and from this we are able to assume that the error-

inducing events occur over small (bit-time) time scales.

Figures 4(b) and 4(a) show packet error rate and bit

error rate, respectively, for a range of received optical

power. The powers in these two figures differ due to the

different experimental setup; packet error is measured

using the SysKonnect 1000BASE-X NIC as a receiver,

and bit error with an Agilent Lightwave receiver, each

have different sensitivities. We see that these different

testframes lead to substantially different BER perfor-

mance. Importantly the relationship between the test data

and BER results has little relationship with the packet-

error rates for the same test data. While not presented

here, the results of Figure 4 have been validated for an

optical power-range of 4dB.

What we can illustrate here is that a uniformly-

distributed set of random data, once encoded with

8B/10B will not suffer code-errors with the same uni-

formity. We considered that the Gigabit Ethernet PCS,

8B/10B, was actually the cause of this non-uniformity.

While not illustrated here, sets of wide-ranging ex-

periments allowed us to conclude that Ethernet-frames

containing a given octet of certain value were up to

100 times more likely to be received in error (and thus

dropped), when compared with a similar-sized packet

that did not contain such octets.

IV. CAUSE AND EFFECT

A. Effects on data sequences

We have found that individual errored octets do not

appear to be clustered within frames but are independent

of each other. However, we are interested in whether

earlier transmitted octets have an effect on the likelihood

of a subsequent octet being received in error. We had

anticipated that the use of running-disparity in 8B/10B

would present itself as correlation between errors in

current codes and the value of previous codes.

1e-11

1e-10

1e-09

1e-08

1e-07

1e-06

1e-05

-8.6 -8.4 -8.2 -8 -7.8 -7.6

Bit

erro

r rat

e

Received power / dBm


Low error testframe

(a) Bit error rate versus received power

1e-06

1e-05

0.0001

0.001

0.01

0.1

-23.4 -23.2 -23 -22.8 -22.6 -22.4 -22.2 -22

Pac

ket e

rror

rate

Received power / dBm


Low error testframe

(b) Packet error rate versus received power

Fig. 4. Contrasting packet-error and bit-error rates versus received power

We collect statistics on how many times each trans-

mitted octet value is received in error, and also store

the sequence of octets transmitted preceding this. The

error counts are stored in 2D matrices (or histograms)

of size �� , representing each pair of octets in

the sequence leading up to the errored octet: one for

the errored octet and its immediate predecessor, one

for the predecessor and the octet before that, and so

on. We normalise the error counts for each of these

histograms by dividing by the matrix representing the

frequency of occurrence of this octet sequence in the

original transmitted data. We then scale each histogram

matrix so that the sum of all entries in each matrix is 1.

Figure 5(a) shows the error frequencies for the “cur-

rent octet” �� (the correct transmitted value of octets

received in error), on the x-axis, versus the octet which

was transmitted before each specific errored octet, �� ,on the y-axis. Figure 5(b) shows the preceding octet

and the octet before that: �� vs �� . Vertical lines

in Figure 5(a) are indicative of an octet that is error-

prone independently of the value of the previous octet.

In contrast, horizontal bands indicate a correlation of

errors with the value of the previous octet.

It can be seen from Figure 5 that while correlation

between errors and the value in error, or the immediately

previous value, are significant, beyond this there is no

apparent correlation. The equivalent plot for �� produces a featureless white square.

B. 8B/10B code-group frequency components and their

effects

It is illustrative to consider the octets which are most

subject to error, and the 8B/10B codes used to represent

them. In the psuedo-random data, the following ten

octets give the highest error probabilities (independent

of the preceding octet value): 0x43, 0x8A, 0x4A, 0xCA,

0x6A, 0x0A, 0x6F, 0xEA, 0x59, 0x2A. It can be seen

that these commonly end in A, and this causes the first

5 bits of the code-group to be 01010. The octets not

0xFF

0xFF

x

x

0x7F

0x7F

i - 1

0x000x00

i

(a) Error counts for �! vs. �" $#&%

0xFF

0xFF

x

x

0x7F

0x7F

i - 2

0x000x00

i - 1

(b) Error counts for �! $#&% vs. �" $#�'Fig. 5. Error counts for pseudo-random data octets

beginning with this sequence in general contain at least 4

alternating bits. Of the ten octets giving the lowest error

probabilities (independent of previous octet), which are

0xAD, 0xED, 0x9D, 0xDD, 0x7D, 0x6D, 0xFD, 0x2D,

0x3D and 0x8D, the concluding D causes the code-

groups to start with 0011.

Fast Fourier Transforms (FFTs) were generated for

data sequences consisting of repeated instances of the

0 200 400 6250

0.5

1.0

Frequency / MHz

(a) FFT of code-group

for high error octet

0x4A

0 200 400 6250

0.5

1.0

Frequency / MHz

(b) FFT of code-group

for high error octet

0x0A

0 200 400 6250

0.5

1.0

Frequency / MHz

(c) FFT of code-group

for low error octet

0xAD

0 200 400 6250

0.5

1.0

Frequency / MHz

(d) FFT of code-group

for low error octet 0x9D

Fig. 6. Contrasting FFTs for a selection of code-groups

code-groups of 8B/10B. Examining the FFTs of the

code-groups for the high error octets, Figures 6(a)

and 6(b), for example, the peak corresponding to the base

frequency (625MHz, half the line rate) is pronounced in

most cases, although there is no such feature in the FFTs

of the code-groups of the low error octets (Figures 6(c)

and 6(d)).

The pairs of preceding and current octets leading

to the greatest error (which are most easily observed

in Figure 5) give much higher error probabilities than

the individual octets. The noted high error octets (eg.

0x8A) do occur in the top ten high error octet pairs and

0 200 400 600 800 10000

time in ps

Rec

eive

r out

put a

mpl

itude

traces from "good" data blocks

traces from "bad" 10101010 block

0.1

0.2

0.3

0.4

0.5

460 480 500 520

0.04

0.05

0.06

Fig. 8. Eye diagram: DFB laser at 1.25Gbps NRZ

normally follow an octet giving a code-group ending in

10101 or 0101, such as 0x58, which serves to further

empasise that frequency component.

The 8B/10B codec defines both data and control

encodings, and these are represented on a 1024x1024

space in Figure 7(a), which shows valid combinations

of the current code-group ( (!� ) and the preceding one

( ()� �� ). The regions of valid and invalid code-groups are

defined by the codec’s use of 3B/4B and 5B/6B blocks

(see II-C).

In Figure 7(a) the octet errors found in the day-trace

have been displayed on this codespace, showing the

regions of high error concentration for real Internet data.

It can be seen that these tend to be clustered and that

the clusters correspond to certain features of the code-

groups. Two groups of clusters have been ringed, those

that are indicated as (!�+*,��.-�-�- represent those codes

with a low-error suffix. In contrast the ringed values

indicated as ( � */��.-�-�- indicates the error-prone

symbols with a suffix of 0xA.

C. Laser Physics

An eye diagram is commonly used to assess the

physical layer of a communications system, where the

effects of noise, distortion and jitter can be seen in

a single diagram [20]. Received analogue waveforms

(before any decision circuit) for each bit of a pseudo-

random sequences are superimposed as in Figure 8.

The central open part of the eye represents the margin

between “1”s and “0”s, where both the height (repre-

senting amplitude margin) and the width (representing

timing margin) are significant. The diagram shown is

computed and includes statistical jitter and amplitude

noise from the laser source but not thermal noise in the

receiver, the inclusion of which would have broadened

the lines significantly and makes the whole diagram less

clear. Electrically, semiconductor lasers are just simple

diodes, but the interaction between electron and photon

populations within the device makes the modulation

response complex. A first-order representation of the

laser and driver may be obtained via a pair of rate

equations, one each for electrons (N) and photons (P):

021043 *

5687:9<;8=

0�>0&1@? 1 9 1BADCFE 9 ?HG 1JI ( 1 � C 9<K 1 �

0&E023 *,L ;8=

0�>0&1 ? 1 9 1BADCFEMI L+N K 1 � 9

EO�P

However, DFB lasers at frequencies above 1 Gbit/s

really need multiple coupled equations of this sort in

order to account for spatial variations within the laser,

as described in Carroll et al. [21]. A significant range

of behaviour is possible as bias, drive conditions, and

0x200

0x200

0x400

i - 1C

iC 0x4000x000x00

Invaliddata,data

control, datacontrol,control

data, control or

(a) Valid QSR $#&%UTWV Q pairs

0x400

C i0x200 0x400

0x200

Ci -

1

C = 0011... C = 010101...i i

0x000x00

(b) Errors using day-trace as a function of code-groups

Fig. 7. The codebook for 8B/10B represented on a 1024x1024 space

physical structure vary. In general, jitter becomes much

worse if laser bias is reduced below threshold, and ampli-

tude noise in the “0” levels becomes a problem if bias is

increased. With ideal bias, just at threshold, some lasers

have sufficient “memory” to react to the high frequency

energy in 10101010 strings, and slight eye closure may

result, as modeled and shown in Figure 8. The effect is

small, but enough to increase the probability of error

for such a data block. In addition, laser drive control

loops, receiver timing loops, and the more sophisticated

bandwidth limiting filters in receivers, could in principle

be disturbed slightly by particular bit sequences, and

hence give increased error rates for those sequences.

V. IMPLICATIONS

In Section III we documented the occurrence of error

hot-spots: data and data-sequences with a higher prob-

ability of error. In addition to increasing the chances

of frame-discard due to data-contents, the occurrence

of such hot-spots also has implications for higher level

network protocols. Frame-integrity checks, such as a

cyclic redundancy check, assume that there will be a

uniformity of errors within the frame, justifying detec-

tion of single-bit errors with a given precision. While

Jain [22] demonstrates that the FRC as used in Ethernet

is sufficiently strong as to detect all 1, 2 and 3 bit errors

for frames up to 8 KBytes in length, problems may be

encountered for certain combinations of errors above

this. Recall that in Section II-C we noted that many

single-bit errors on the physical layer will translate into

multi-bit errors following decoding by the PCS.

Network/Transport Layer Issues

Up until now we have concentrated upon the interac-

tion between the physical layer and the data-link layer,

such as that embodied in 1000BASE-X. We briefly note

the interaction that data-link layer effects may have with

the network and transport layer.

In our original paper on this work (James et al. [18])

we highlighted the non-uniform distribution of packet er-

rors that result from an interaction between the physical

coding conditions, the physical coding sub-layer of Giga-

bit Ethernet, and particular data to be transported through

the network. That work identified that certain data-values

had a substantially higher probability of being received

in error, which resulted in packets with those payloads

being discarded with a higher-than-normal probability.

An analysis of the contents of day-trace data along

with other data derived as part of our network-monitoring

work allows us to conclude that in addition to (user)

data-payloads the error-concentrating effects will cause a

significant level of loss due to the network and transport-

layer header contents. In one hypothetical case, if a user

was on a machine with an IP address that consisted

of several high-error-rate octets their data will be at

a proportionally higher risk of being corrupted and

discarded at the Gigabit Ethernet layer.

Further, the occurrence of error hot-spots has ram-

ifications quite independent of the Gigabit Ethernet

standard. Stone et al. [23] discuss the impact this has

for the checksum of TCP. These results may call into

question our assumption that only increased packet-loss

will be the result of the error hot-spots. Instead of just

lost packets, Stone et al. noted certain “unlucky” data

would rarely have errors detected. A further example of

related work is Jain [22] which illustrates how errors

will impact the PCS of FDDI and require specific error

detection/recovery in the data-link layer.

VI. CONCLUSIONS

In this paper we have illustrated how there is scope

for problems to develop in the design, development and

evolution of the physical components, algorithms, proto-

cols and applications that constitute our networks. These

problems are caused by the undesirable interactions

between network modules. They may be slow to evolve,

such as the inevitable frequency problems that have

arisen with the limited scrambler and SONET; alterna-

tively they may be more abrupt, such as the undesirable

interactions between ATM and TCP/IP documented by

Stone et al. [23].

We define naıve layering: the development of protocol

layers beyond the scope of the original specification.

Often this layer evolution is combined with assumptions

by a developer about the layers below and above. Best

described by example, naıve layering can result in the

construction or specification of a network module with-

out sufficient understanding of the layers with which it

must inter-operate.

Examining the 8B/10B code used by Gigabit Ethernet,

we have documented the form and cause of failures that

occur in the low-power regime, inducing, at best, poor

performance and, at worst, undetected errors that may

focus upon specific networks, applications and users.

The errors observed in Gigabit Ethernet in a low-power

regime are not uniform as may be assumed. Some

packets will suffer greater loss rates than the norm. This

content-specific effect is particularly insidious because it

occurs without a total failure of the network.

We applied a scrambler as a form of noise-whitener

and were able to successfully illustrate that its impact

removed the error hot-spotting in the data-space.

We have seen how future optical networks will consist

of an increasingly large number of diverse elements

and will have very limited optical power budgets. This

will cause a fundamental change in the physical optical

layer, which will also impact the construction of these

networks. Alongside this we would maintain that these

changes will lead to occurrences of naıve layering be-

coming more prevalent.

Future Work

We anticipate further study of the effects of scramblers

on Gigabit Ethernet, and by extension, on other N,K

block coded systems. In particular we would like to

fully understand the interactions between block coding

schemes and scrambling. We are also interested in the

effects of jitter in the interactions between the layers of

an optical network, as the jitter requirements for various

network specifications (e.g. Gigabit Ethernet, SONET)

are not just different in quantity, but measured in quite

different ways.

Acknowledgements

Many thanks to Ian White, Richard Penty, Derek

McAuley, Bradley Booth, Jon Crowcroft, David G.

Cunningham, Eric Jacobsen, Peter Kirkpatrick, Barry

O’Mahony, Ralphe Neill, Ian Pratt, Adrian P. Stephens,

Kevin Williams, and Adrian Wonfor. Additionally, An-

drew Moore acknowledges the Intel Corporation’s gener-

ous support of his research fellowship and Laura James

would like to thank Marconi for its support of her PhD

research.

REFERENCES

[1] J. Stone and C. Partridge, “When the CRC and TCP checksum

disagree,” in Proceedings of ACM SIGCOMM 2000. ACM

Press, Aug. 2000.

[2] W. Simpson, “PPP over SONET/SDH,” IETF, RFC 1619, May

1994.

[3] A. Malis and W. Simpson, “PPP over SONET/SDH,” IETF,

RFC 2615, June 1999.

[4] ANSI, “T1.105-1988, Synchronous Optical Network (SONET)

— Digital Hierarchy: Optical Interface Rates and Formats

Specification,” 1988.

[5] D. L. Tennenhouse, “Layered multiplexing considered harm-

ful,” in Protocols for High-Speed Networks. North Holland,

Amsterdam, May 1989.

[6] IEEE, “IEEE 802.3z — Gigabit Ethernet,” 1998, standard.

[7] A. R. Pratt et al., “5745 km DWDM transcontinental field trial

using 10Gbit/s dispersion managed solitons and dynamic gain

equalization,” in Proceedings of OFC-2003, 2003, post-deadline

paper PD26.

[8] J. S. Turner, “Terabit burst switching,” J. High Speed Netw.,

vol. 8, no. 1, pp. 3–16, 1999.

[9] P. Gambini et al., “Transparent optical packet switching: net-

work architecture and demonstrators in the KEOPS project,”

IEEE Journal on Selected Areas in Communications, vol. 16,

no. 7, pp. 1245–1259, Sept. 1998.

[10] D. McAuley, “Optical Local Area Network,” in Computer

Systems: Theory, Technology and Applications, A. Herbert and

K. Sparck-Jones, Eds. Springer-Verlag, Feb 2003.

[11] L. James, G. Roberts, M. Glick, D. McAuley, K. Williams,

R. Penty, and I. White, “Wavelength Striped Semi-synchronous

Optical Local Area Networks,” in London Communications

Symposium (LCS 2003), Sept. 2003.

[12] IEEE, “IEEE 802.3ah — Ethernet in the First Mile,” 2004,

standard.

[13] A. Markopoulou, G. Iannaccone, S. Bhattacharyya, C.-N.

Chuah, and C. Diot, “Characterization of failures in an IP

backbone,” in Proceedings of IEEE INFOCOMM 2004, Hong

Kong, Mar. 2004, Sprint ATL Research Report.

[14] A. X. Widmer and P. A. Franaszek, “A DC-Balanced,

Partitioned-Block, 8B/10B Transmission Code,” IBM Journal of

Research and Development, vol. 27, no. 5, pp. 440–451, Sept.

1983.

[15] The Fibre Channel Association, Fibre Channel Storage Area

Networks. Eagle Rock, VA: LLH Technology Publishing, 2001.

[16] IEEE, “IEEE 1394b — High-Performance Serial Bus,” 2002,

standard.

[17] E. Solari and B. Congdon, The Complete PCIExpress Reference.

Hillsboro, OR: Intel Press, 2003.

[18] L. B. James, A. W. Moore, and M. Glick, “Structured Errors in

Optical Gigabit Ethernet,” in Passive and Active Measurement

Workshop (PAM 2004), Apr. 2004.

[19] “tcpfire,” 2003, http://www.nprobe.org/tools/.

[20] R. Ramaswami and K. N. Sivarajan, Optical Networks. Morgan

Kaufmann, 2002, pp. 258–263.

[21] J. E. Carroll, J. Whiteaway, and R. Plumb, Distributed Feedback

Semiconductor Lasers, ser. IEE Circuits, Devices & Systems

Series. Co-published by the IEE and SPIE Press, 1998, no. 10.

[22] R. Jain, “Error Characteristics of Fiber Distributed Data Inter-

face (FDDI),” IEEE Transactions on Communications, vol. 38,

no. 8, pp. 1244–1252, 1990.

[23] J. Stone, M. Greenwald, C. Partridge, and J. Hughes, “Perfor-

mance of Checksums and CRCs over Real Data,” in Proceed-

ings of ACM SIGCOMM 2000, Stockholm, Sweden, Aug. 2000.

explaining structured errors in gigabit...

Documents