fault discovery protocol for passive optical networks



Vol. 6, No. 6 / June 2007 / JOURNAL OF OPTICAL NETWORKING 701

Fault discovery protocol for passive optical networks

Marek Hajduczenia,1,2,* Daniel Fonseca,1,3 Henrique J. A. da Silva,2 and Paulo P. Monteiro1,4

1 Nokia Siemens Networks S. A., Rua Irmãos Siemens, No. 1, 2720-093 Amadora, Portugal

2 Departamento de Engenharia Electrotécnica e de Computadores, Instituto de Telecomunicações, Universidade de Coimbra, Pólo II, 3030-290 Coimbra, Portugal

3 Department of Electrical and Computer Engineering, Optical Communications Group, Instituto de Telecomunicações, Instituto Superior Técnico, 1049-001 Lisbon, Portugal

4 Instituto de Telecomunicações–Pólo de Aveiro, Universidade de Aveiro, 3810-193 Aveiro, Portugal

* Corresponding author: [email protected]

Received October 23, 2006; revised February 9, 2007; accepted April 13, 2007; published May 21, 2007 (Doc. ID 76314)

All existing flavors of passive optical networks (PONs) provide an attractive alternative to legacy copper-based access lines deployed between a central office (CO) of the service provider (SP) and a customer site. One of the most challenging tasks for PON network planners is the reduction of the overall cost of employing protection schemes for the optical fiber plant while maintaining a reasonable level of survivability and reducing the downtime, thus ensuring acceptable levels of quality of service (QoS) for end subscribers. The recently growing volume of Ethernet PONs deployment [Kramer, IEEE 802.3, CFI (2006)], connected with low-cost electronic and optical components used in the optical network unit (ONU) modules, results in the situation where remote detection of faulty/active subscriber modules becomes indispensable for proper operation of an EPON system. The problem of the remote detection of faulty ONUs in the system is addressed where the upstream channel is flooded with the cw transmission from one or more damaged ONUs and standard communication is severed, providing a solution that is applicable in any type of PON network, regardless of the operating protocol, physical structure, and data rate. © 2007 Optical Society of America

OCIS codes: 060.4250, 060.4510.

1. Introduction

Service providers (SPs) are increasingly looking to passive optical network (PON) solutions as the viable alternative to current copper-based broadband access technologies, which are no longer considered as futureproof, especially in the light of the upcoming rollout of high definition (HD) TV, on-demand video services, and generally growing subscriber demand in terms of available bandwidth and quality of service (QoS). Additionally, there has been a steady change in the mentality of SPs, since initially PON systems were considered cost prohibitive to deploy, mainly due to the need to replace copper cabling with fiber strands. However, as fiber deployment costs are steadily decreasing, and new multimedia applications are emerging, PON systems with their inherently low number of active equipment modules and highly optimized transmission protocols are now perceived as the cost-effective, futureproof conduit for delivering high-bandwidth voice, video, and data services.

Simultaneously, PON system subscribers rely heavily on a few strands of optical fibers as their primary means of digital connectivity, and thus numerous concerns regarding the survivability of high-capacity fiber communications networks from any single point of failure are addressed by both industry and academia. Fortunately, the QoS loss and network downtime can be minimized by using survivable network planning in architecture, technology, and design, though there is always a trade-off in the form of degree of confidence in network survivability versus cost of network implementation. Choosing the right level of network availability and survivability relates directly to the SP’s core business, since customers are becoming increasingly aware of the QoS for the services which they pay for [1,2].

1536-5379/07/060701-18/$15.00 © 2007 Optical Society of America


The recently growing deployment volume of Ethernet PONs (EPONs) [3], connected with low-cost electronic and optical components used in the optical network unit (ONU) modules, results in the situation where remote detection of faulty/active subscriber modules becomes indispensable for proper operation of an EPON system. One of the commonly indicated problems resulting from the application of low-cost electronic laser module drivers is related to the accidental flooding of the upstream channel with the cw transmission from one or more damaged ONUs, which under certain unfavorable conditions (ONU distance distribution and emitted power level) may result in the severance of the upstream channel transmission or at least the deterioration of the operating conditions for the whole system.

The remainder of this paper is organized as follows. Section 2 presents a brief description of the EPON, which is currently the most widely deployed flavor of PON technology [4]. Both downstream and upstream transmission channels are examined in detail (Subsections 2.A and 2.B, respectively). The problem statement (detection of a faulty ONU with the upstream channel flooded with the cw signal) is included in Subsection 3.A. Subsection 3.B presents the currently existing (patented) and proposed mechanisms for fault detection in PON systems, while Subsection 3.C examines the physical effects occurring in the system flooded with a cw signal, with Subsection 3.D illustrating the theoretical discussion from Subsection 3.C. The proposed fault detection protocol (FDP) is examined in detail in Subsection 3.F, with the downstream and upstream channel components described separately in Subsections 3.F.2 and 3.F.3, respectively. The main conclusions are drawn in Section 4.

2. EPONs: Examples of PON Technology

The EPON is a point-to-multipoint (P2M) architecture, with a single central office (CO) delivering services to a number of residential and/or business customers, and thus all data transmissions in the EPON system are performed between the optical line terminal (OLT) and the ONUs. The OLT is typically manufactured in the form of a network card blade placed in a CO-based chassis, and always remains under the strict control and supervision of the network provider. ONUs, on the other hand, are commonly deployed as stand-alone boxes, and their exact location depends on the deployment scenario adopted by the particular SP: home for fiber-to-the-home (FTTH), curb in fiber-to-the-curb (FTTC), and business office in fiber-to-the-business (FTTB) (see Fig. 1 for details).

Fig. 1. Standard EPON deployment with various FTTx solutions: FTTB, FTTC, FTTH/mixed FTTH.

The only active elements in the network structure are located at the ends of the transmission channel (OLT and ONUs), though they have strictly different functionalities. The OLT connects the optical access network to the metropolitan area network (MAN) or wide area network (WAN), typically termed backbone or long-haul network, while ONUs typically aggregate traffic streams from individual subscribers and prepare them for transmission toward the OLT. Additionally, ONUs employ packet-prioritization mechanisms or scheduling, enabling full QoS support and enforcement of service level agreements (SLAs) between the internet service provider (ISP) and the end subscribers.

2.A. Downstream Transmission in EPON Systems

In the downstream direction (from the OLT toward the ONUs), Ethernet packets broadcast by the OLT pass through a 1×N passive splitter combiner (PSC) or a PSC module cascade to reach the ONUs (see Fig. 2 for details). Each ONU receives a copy of every downstream data packet. The number of connected ONUs typically varies between 4 and 64, limited by the available optical power budget. According to the currently valid standard (IEEE 802.3-2005, hereinafter referred to as 802.3), the typical number of connected ONUs is defined as at least 16.

The downstream channel properties in a typical PON system make it a shared-medium network: packets broadcast by the OLT are selectively extracted by the destination ONU, which applies simple packet-filtering rules based on MAC and Logical Link IDentifier (LLID) addresses (802.3, clause 64). The downstream channel operation is depicted in Fig. 2, where packets destined to different end subscribers are filtered out by the ONUs from the broadcast downstream data flow.

2.B. Upstream Transmission

Fig. 2. Downstream channel transmission in an EPON (P2M operation [broadcast] and LLID packet filtering). PAN stands for personal area network, LAN for local area network, and MDU for multidwelling unit.

Fig. 3. Upstream channel transmission in an EPON (M2P operation): standard TDM-based channel sharing.

In the upstream direction (see Fig. 3 for details), from the ONU toward the OLT, the EPON operates in the multipoint-to-point (M2P) mode, where a number of connected ONUs transmit their data packets to a single receiver module located in the OLT. Moreover, since the PSC is a strictly directional device, individual ONUs are not aware of other ONUs’ transmissions in the upstream channel. The resulting connectivity is similar to the P2P architecture, where centrally managed access to the upstream channel (all ONUs belong to a single collision domain) allows for only a single ONU at a time to deliver its pending packets. The upstream medium access is performed via a Dynamic Bandwidth Allocation (DBA) algorithm, and ONUs in their default state are not allowed to transmit any data unless granted specifically by the OLT. Data collisions are avoided, since the central OLT controller is always aware of the scheduled transmissions from individual ONUs, and thus can effectively grant access to the upstream channel to particular subscriber stations and assure that their transmissions will not collide. The only exception from this centrally managed upstream channel access scheme is the so-called discovery process (802.3, clause 64), where new and still not initialized ONUs are allowed to register in the EPON system.
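The centrally managed, collision-free upstream access described above can be illustrated with a minimal sketch. It is not the 802.3 MPCP message format and not an actual DBA algorithm: the `Grant` record, the `schedule_grants` and `windows_overlap` helpers, the microsecond units, and the guard interval are all hypothetical names and values chosen for illustration. The sketch only shows the idea that the OLT hands out back-to-back, non-overlapping transmission windows.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Grant:
    """Hypothetical upstream transmission window for one ONU (OLT time base)."""
    onu_id: str
    start_us: float
    length_us: float

    @property
    def end_us(self) -> float:
        return self.start_us + self.length_us


def schedule_grants(requests_us: Dict[str, float],
                    cycle_start_us: float = 0.0,
                    guard_us: float = 1.0) -> List[Grant]:
    """Assign consecutive, non-overlapping windows to every ONU that asked for one.

    requests_us maps an ONU id to the requested window length; ONUs that
    requested nothing receive no grant. guard_us is an assumed guard interval.
    """
    grants: List[Grant] = []
    t = cycle_start_us
    for onu_id, length in requests_us.items():
        if length <= 0:
            continue
        grants.append(Grant(onu_id, t, length))
        t += length + guard_us
    return grants


def windows_overlap(grants: List[Grant]) -> bool:
    """Return True if any two windows overlap in time (i.e., a collision)."""
    ordered = sorted(grants, key=lambda g: g.start_us)
    return any(a.end_us > b.start_us for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    for g in schedule_grants({"onu1": 100.0, "onu2": 50.0, "onu3": 0.0}):
        print(g.onu_id, g.start_us, g.end_us)
```

Because the schedule is built sequentially by a single controller, no two granted windows can overlap; a transmitter that ignores its grant breaks exactly this property.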

A multiple access protocol is required in the upstream direction, since the EPON operates in the M2P mode and every single ONU talks directly to the OLT. A contention-based media access mechanism (similar to CSMA/CD [5]) is difficult to implement, since in the typical network deployment ONUs cannot detect a collision at the OLT (due to the aforementioned directivity feature of the PSC module). Simultaneously, providing the PON architecture with a feedback loop leading to every single ONU is not economically feasible. Contention-based schemes have the drawback of providing a nondeterministic service, i.e., node throughput and channel utilization may be described only as statistical averages, and hence there is no guarantee of an ONU getting access to the media in finite time. As a consequence, this type of access protocol is ill suited for delay-sensitive transmissions, such as video conferencing or Voice over Internet Protocol (VoIP). To introduce determinism in the frame delivery, different noncontention schemes based on request–grant mechanisms have been proposed [6–9].

3. Fault Detection Protocol for PON Systems

3.A. Upstream Transmission Channel: Faulty ONU Problem

Under normal circumstances, all data packets originating from active ONUs should therefore reach the PSC node in a nonoverlapping manner, avoiding any collisions and arriving at the OLT receiver in a condition fit for detection and data recovery. However, such a system design (currently specified within the 802.3, clauses 60 and 64) is highly susceptible to hardware faults, including the following potential events.

• Device fault:
  • Medium fault,
  • Physical layer (PHY) fault,
  • Media Access Control (MAC) fault.

• Network fault:
  • Wire rip/bend,
  • Loose connector,
  • Dirt contamination.

Network-type faults are typically considered graceful since their occurrence terminates service either for a particular ONU (a fault in the given drop section), while leaving other ONUs unaffected, or for the whole EPON branch, terminating connectivity for all subscribers. Either way, such a signal loss from at least one ONU is easily detectable at the OLT level through the keep-alive mechanism, as defined in the 802.3, and as such it is considered solved. Moreover, a network attacker cannot perform targeted denial of service (DoS) on any subscriber equipment apart from his own, and thus network-type faults cannot be exploited for malicious activity.

Device-type faults are also mostly graceful, since both medium and MAC faults render the particular ONU inoperable, with either bilateral or unilateral signal loss. Bilateral signal loss is here considered as bidirectional interruption of information flow, where the ONU is not capable of receiving any data from the OLT or of sending any data toward the OLT. Unilateral signal loss leaves the ONU capable of receiving the OLT signal with no way to answer back, or vice versa. Either scenario causes the given ONU to be disconnected from the EPON system through the keep-alive mechanism operation. The OLT typically detects such a situation and signals an ONU disconnection event to the network administrator. It must be noted though that, at the OLT level, such an ONU failure cannot be distinguished from a simple ONU power-down event, and thus it is typically reported as a general ONU power-down state.


However, one class of ONU device faults can result in a complete system service severance and may be used for malicious DoS attacks on the EPON structure. A PHY level failure might result in an uncontrollable behavior of the laser diode driver circuitry, leading either to catastrophic avalanche increase in the laser current (resulting in diode destruction through thermal effects) or to laser driver lock in either the on or the off state. Provided that the laser driver gets locked in the off state, such a PHY failure is similar to any other device type failure, and leads to signal loss at the OLT level and consequent ONU disconnection from the polling mechanism. However, a laser driver locked in the on state (cw operation mode, with no data stream modulation) leads to DoS for all ONUs in the system, as depicted in Fig. 4. Provided that the damaged ONU has a sufficiently high power level and its emitted signal spectrum is sufficiently broad to cover the transmission spectra of the other stations, no other ONU in the system will be able to transmit data in the upstream channel as long as the disruptive signal is present. The most dangerous feature of this particular failure mechanism is that it might be easily used by a malicious person to spawn a complete DoS attack on the network structure; after all, it is enough to connect a laser source operating in the proper upstream transmission window (1310 nm in the case of EPONs), with sufficient emitted power and spectrum width.

3.B. Existing Failure Detection Systems for PON Networks

Currently, network failures and most types of device failures are detectable only through subscriber feedback (lack of connectivity), and are typically resolved through ONU replacement or by the visit of a qualified technician to the subscriber’s site, to examine the ONU equipment in detail. There is no way to distinguish the ONU failure from ONU power-down states remotely (from the OLT level). Therefore, in the case of a DoS event in the network caused by an ONU laser lock event and/or targeted malicious activity, each ONU in the system must be checked by hand, since there are no mechanisms for remote damaged ONU identification under DoS conditions. Typical countermeasures in such a situation include sending a team of technicians to all customers connected to the given xPON branch and testing their ONU equipment to isolate the potentially damaged device. Since network downtime reduces QoS and breaches SLAs with the customers, such fault isolation procedures must be rapid in order to minimize the network inactivity period.

Fig. 4. Upstream channel transmission in an EPON (M2P operation), DoS from one ONU.

References [10,11] present a scheme allowing for PHY-level-based faulty ONU isolation, where a loopback method is used to confirm whether the given ONU operates correctly and responds properly to OLT transmitted control signals. Instead of burdening the OLT with the ONU state control tasks, a separate device termed loopback tester is used, which employs a pseudonoise (PN) correlation detection technique to decide on the current state of the given ONU. The said tester module sends a control command to the desired ONU to close the loopback circuit during measurement, and the test signal with the PN pattern is subsequently transmitted to the ONU. The ONU reflects the test signal sent by the loopback tester to the upstream, and then the tester measures the autocorrelation function corresponding to the received optical power of the upstream light coming from the ONU. Additionally, the ONU implements a low-speed optical modulation function for transmission of a low-data rate unique identification (UID) stream, allowing for remote ONU identification at the OLT (tester) level. The said technique allows for remote ONU detection and identification of faulty modules, but this added functionality requires significant ONU hardware changes in the form of an additional analog signal path switch and an additional OOK modulator, which will transmit upstream the UID tag with adjustable data rate ranging from 5 to 200 bit/s. Moreover, the OLT side requires hardware level changes as well, mainly in the form of a signal correlation module and a binary phase shift keying (BPSK) modulator to deliver downstream a PN bit stream. The said hardware level extensions are neither cheap nor easy to implement in a cost-effective manner, and thus the commercial feasibility of the proposed mechanism remains questionable.

One of the possible solutions in terms of failure detection in the PON system may be based on tagging each ONU in the system with a network unique tag, which is transmitted upstream toward the OLT module using either an alternative modulation on top of the typical intensity modulated nonreturn-to-zero (NRZ) signal or a harmonic frequency outside of the main signal transmission band. This mechanism resolves the ONU failure detection problem, with the laser being stuck in the enabled mode, though it requires hardware level modifications in the ONUs and the OLT. Each ONU must be equipped with a signal generating source, producing a harmonic signal at a frequency outside of the main transmission band in such a manner that each ONU is assigned a unique frequency band, based on which the OLT might identify which ONU is transmitting at the moment. The OLT must therefore be equipped with a frequency spectrum analyzer as well, as it has to maintain a frequency association map to be able to identify which ONUs are currently transmitting data in the upstream channel (see Fig. 5 for details on the internal ONU structure). If one of the ONU laser drivers fails, the given transmission window will contain two frequency identifiers, and the OLT task is limited to isolating both components and checking the frequency allocation table to see which ONU was scheduled for transmission and which was not, clearly identifying the faulty device. The obvious disadvantage of the proposed solution is directly related to the need of employing a complex frequency allocation scheme, where each ONU must be assigned a network unique frequency band.
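The lookup step of the frequency-tag scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `match_onu` and `find_faulty_onus`, the tag frequencies, and the matching tolerance are hypothetical, and the spectrum analysis that produces the list of detected tag frequencies is assumed to happen elsewhere in the OLT.

```python
from typing import Dict, Iterable, Optional, Set


def match_onu(tag_hz: float,
              freq_map: Dict[str, float],
              tolerance_hz: float) -> Optional[str]:
    """Map a detected tag frequency to an ONU id using the allocation table."""
    for onu_id, assigned_hz in freq_map.items():
        if abs(tag_hz - assigned_hz) <= tolerance_hz:
            return onu_id
    return None  # tag does not belong to any registered ONU


def find_faulty_onus(detected_tags_hz: Iterable[float],
                     scheduled_onu: str,
                     freq_map: Dict[str, float],
                     tolerance_hz: float = 500.0) -> Set[str]:
    """Return the ONUs whose tags are present although they were not scheduled.

    Every tag observed in the current transmission window is resolved to an
    ONU id; the ONU that actually holds the grant is then excluded.
    """
    active = set()
    for tag in detected_tags_hz:
        onu_id = match_onu(tag, freq_map, tolerance_hz)
        if onu_id is not None:
            active.add(onu_id)
    active.discard(scheduled_onu)
    return active


if __name__ == "__main__":
    # Hypothetical allocation table: one out-of-band tag frequency per ONU.
    table = {"onu1": 10_000.0, "onu2": 20_000.0, "onu3": 30_000.0}
    # Two tags seen while only onu1 holds the grant: onu3 is the suspect.
    print(find_faulty_onus([10_020.0, 29_990.0], "onu1", table))
```

The sketch also makes the stated drawback visible: the table must hold a network-unique frequency for every ONU, and the tolerance must be smaller than half the spacing between neighboring tags.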

3.C. ONU Failure Detection Under DoS Conditions: Physical Background

This subsection describes the physical background effects occurring in the PON system, where the upstream channel is flooded with the cw signal transmitted by one or more damaged ONUs. Since the PON system operation at the hardware level is the same for all existing xPON solutions, the description is generic and thus in this section we try to abstract from any particular xPON solution. Additionally, it is irrelevant, in terms of the focus of this paper, how the laser of the faulty ONU is driven into the driver lock situation. Once there, the physical level phenomena occur as described below.

The transmission effects along the optical fiber may be reduced to attenuation (a simplification that disregards nonlinear effects as well as dispersion, the latter due to operation in the proximity of the zero-dispersion wavelength), since the data rate of the optical signal is relatively low (1.25 Gbit/s for EPONs, 1.244 Gbit/s for GPONs, and 622 Mbit/s for BPONs in the upstream channel). As a consequence, the information eye patterns from the different ONUs at the input of the OLT have negligible intensity distortion, which means that rectangular pulse shapes may be assumed in the detected signal without loss of generality.

Fig. 5. Internal structure of the ONU module capable of delivering network unique tag transmitted in alternative frequency band or using other modulation format.

As a direct result of at least one of the ONUs connected to the PON structure entering into the laser locked state, the DoS condition occurs due to an interference effect between a modulated optical signal (originating from an ONU operating in a standard manner) and a cw signal (originating from the faulty ONU). The optical signal at the input of the photodetector of the OLT, $E(t)$, is given by the following expression:

$$E(t) = A_1(t)\exp[j\omega_1 t + j\phi_1(t)] + A_2\exp[j\omega_2 t + j\phi_2(t)], \qquad (1)$$

where $A_x$, $\omega_x$, and $\phi_x$ are the amplitude of the envelope, the angular frequency, and the optical phase (mainly due to phase noise of the laser source) of signal $x$ ($x = 1$ or 2), respectively, and $t$ stands for time. The first term on the right-hand side of expression (1) is the modulated signal of the ONU operating in the standard way, whereas the second term is the cw signal. The electrical current at the output of the photodetector, $I(t)$, may thus be described by

$$I(t) \propto |A_1(t)|^2 + |A_2|^2 + 2A_1(t)A_2\cos[(\omega_1 - \omega_2)t + \phi_1(t) - \phi_2(t)]. \qquad (2)$$
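The step from expression (1) to expression (2) can be made explicit. The short derivation below assumes an ideal square-law photodetector and real-valued envelopes $A_1(t)$ and $A_2$; the shorthand $\theta(t)$ is introduced here only for illustration and is not part of the original notation.

```latex
\begin{align*}
\theta(t) &= (\omega_1 - \omega_2)t + \phi_1(t) - \phi_2(t), \\
I(t) &\propto |E(t)|^2 = E(t)\,E^{*}(t) \\
     &= |A_1(t)|^2 + |A_2|^2
        + A_1(t)A_2\left[e^{j\theta(t)} + e^{-j\theta(t)}\right] \\
     &= |A_1(t)|^2 + |A_2|^2 + 2A_1(t)A_2\cos\theta(t).
\end{align*}
```

The cross term oscillates at the difference frequency $\omega_1 - \omega_2$, so whether it survives detection depends on how that difference compares with the receiver bandwidth.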

Under such circumstances, the interference can take two distinct forms: in-band or out-of-band. In the in-band situation, $\omega_1 \approx \omega_2$ (the central wavelength spacing for the two interfering signals lies within the OLT receiver bandwidth, thus producing the beating effect), and expression (2) is further simplified to

$$I(t) \propto |A_1(t)|^2 + |A_2|^2 + 2A_1(t)A_2\cos[\phi_1(t) - \phi_2(t)]. \qquad (3)$$

In expression (3), the first term represents the detected modulated signal, the second term is proportional to the detected cw signal, and the third term results from the beating between the modulated and the cw signals. The eye pattern at the output of the photodetector presents significant intensity distortion because of the presence of this third term. An example of such an eye pattern is presented in Fig. 6(a). The information of the modulated signal can still be recognized, although detection errors may occur due to the interference noise present in the higher level (digital “1”). A clock and data recovery (CDR) unit can still lock to a signal similar to the one presented in Fig. 6(a).

However, if the power of the cw signal is increased significantly, the detected eye pattern will close [12] and the CDR will lose lock. At this point, it is important to emphasize that the situation of strictly identical wavelengths between modulated and cw signals is highly improbable in the xPON scenario, because the lasers of the ONUs are independent and subject to different environmental conditions, since they are placed at different locations. Additionally, in terms of xPON systems, the wavelengths of the lasers used in the ONUs must be within a specific range (100 nm at approximately 1310 nm wavelength) in order to comply with the values presented in 802.3 Tables 60-3 and 60-6 for EPONs, and in Tables 2.d–2.f in the G.984.2 standard, for broadband passive optical network (BPON) and gigabit passive optical network (GPON) systems.

Fig. 6. Typical eye pattern for a case with (a) in-band interference and (b) out-of-band interference, as received in the continuous transmission mode at the OLT level.

In the case of the out-of-band interference, where $\omega_1 \neq \omega_2$ (the wavelengths of the two signals are significantly different from each other), the electrical current at the output of the photodetector must be modeled by expression (2). However, as the photodetector is followed by electrical circuitry, a low-pass effect occurs and the third term on the right-hand side is filtered out. As a consequence, the electrical current may be approximated by

$$I(t) \propto |A_1(t)|^2 + |A_2|^2. \qquad (4)$$

As can be concluded from Eq. (4), the distortion occurs due to the presence of a constant value in the electrical current [second term in Eq. (4)]. Figure 6(b) presents the typical eye pattern under such conditions. The eye pattern is undistorted (high and low logical levels can be clearly identified), as only a constant value is added to the signal. In this situation, a CDR unit can easily lock onto the bit stream received from an ONU operating in the standard way.
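One consequence of the constant term in Eq. (4) can be checked numerically: adding the same power to both logical levels leaves the eye open but lowers the ratio between the levels. The sketch below assumes that detected powers simply add, as in Eq. (4); the function name `extinction_ratio_db` and the power values in milliwatts are hypothetical and are not measurements from this paper.

```python
import math


def extinction_ratio_db(p_one_mw: float,
                        p_zero_mw: float,
                        p_cw_mw: float = 0.0) -> float:
    """Extinction ratio in dB when a constant cw power is added to both levels.

    p_one_mw and p_zero_mw are the detected powers of the logical "1" and "0"
    levels of the modulated signal; p_cw_mw is the out-of-band cw power that,
    following Eq. (4), adds the same constant to both levels.
    """
    high = p_one_mw + p_cw_mw
    low = p_zero_mw + p_cw_mw
    if high <= 0 or low <= 0:
        raise ValueError("power levels must be positive")
    return 10.0 * math.log10(high / low)


if __name__ == "__main__":
    # Hypothetical levels: 1.0 mW for "1" and 0.1 mW for "0" (10 dB ratio).
    for p_cw in (0.0, 0.05, 0.1, 0.5):
        print(f"cw = {p_cw:.2f} mW -> ER = {extinction_ratio_db(1.0, 0.1, p_cw):.2f} dB")
```

The absolute spacing between the two levels is unchanged by the offset, which is why the eye stays open; only the ratio of the levels degrades as the cw power grows.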

3.D. ONU Failure Detection Under DoS Conditions

An illustration of the theoretical discussion in Subsection 3.C was carried out by overlapping two transmission signals in the 1310 nm transmission window, one with standard Ethernet framing modulation at 1.25 Gbit/s and the other one with no modulation at all (cw operation mode), and observing the retrieved signal on a digital oscilloscope, clocked using a signal retrieved from the CDR circuitry. No burst mode transmission was simulated. The system setup schematic is depicted in Fig. 7. Two 50/50 splitters–combiners were used in the setup system, first to overlay the signals from two separate lasers and then to divide them into the CDR and oscilloscope signal paths.

First, the out-of-band interference situation was examined in detail, with the interfering signal transmitting at 1300 nm and the interfered signal at 1310 nm (see Fig. 8 for details). As theoretically expected, the only difference between the low and high interfering signal power conditions is the presence of a dc component and a change in the extinction ratio (7.0 and 5.9 dB for low and high power interfering signals, respectively). Since the CDR module is typically equipped with a dc filter block, the only observable change in the signal quality is related to a decrease of the extinction ratio. Otherwise, the received bit stream is characterized by an open eye and can be received properly at the OLT level.

To evaluate the power level ratio between the interfering and the interfered signals, the fiber was disconnected from the photodetector, linked to a power meter and then, sequentially, one of the inputs of the 50/50 coupler was disconnected. In this way, it was possible to measure the interfering and interfered signal power levels independently.

Fig. 7. Optical system setup for the illustration of the theoretical discussion in Subsection 3.C.

Fig. 8. Out-of-band interference with (a) very low interfering signal power (−27 dBm) and (b) relatively high interfering signal power (−11.3 dBm). Both oscilloscope screenshots were taken with the same settings.

Next, the interfering signal source was aligned with the interfered signal and both were transmitting at 1310 nm with an alignment precision better than 0.5 nm (strict in-band interference). The following figures show oscilloscope screenshots along with the measured signal power levels, both for interfering and interfered signals. The images were processed to remove only the oscilloscope frames and menus, which do not contain any useful information.

The first two images (Figs. 9 and 10) present the signal eye pattern with no interfering signal and with very low interfering signal power (−17.3 versus −11.6 dBm). Figure 9 depicts the original signal with a clear eye pattern and no interference. Figure 10 presents the evolution of the signal eye pattern in the presence of the interfering signal. Here the signal power level ratio was estimated at 5.6 dB in favor of the data carrying signal.

Figure 11 depicts the threshold situation, under which the CDR circuitry began losing clock recovery capabilities: the observed signal eye pattern is partially closed, with a signal power level ratio estimated at 1.8 dB (interfering signal power level at −13.4 dBm and interfered signal power level at −11.6 dBm). Finally, the signal eye pattern closed with an interfering signal power level above −13.2 dBm, as depicted in Fig. 12. The CDR lost clock recovery capability, raising an alarm (clock loss), and the connected network analyzer showed a completely closed eye pattern.
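The power level ratios quoted for the in-band measurements follow directly from the dBm readings, since a ratio of two powers in linear units is a difference in dB. The sketch below recomputes them; the helper name and the dictionary layout are illustrative only, and the power levels are those given in the text and in the captions of Figs. 10 to 12.

```python
def power_ratio_db(interfered_dbm: float, interfering_dbm: float) -> float:
    """Ratio of interfered (data) power to interfering (cw) power, in dB."""
    return interfered_dbm - interfering_dbm

# (interfering power, interfered power) in dBm, as quoted for each figure.
MEASUREMENTS = {
    "Fig. 10": (-17.3, -11.6),
    "Fig. 11": (-13.4, -11.6),
    "Fig. 12": (-12.9, -11.6),
}

for label, (interfering_dbm, interfered_dbm) in MEASUREMENTS.items():
    ratio = power_ratio_db(interfered_dbm, interfering_dbm)
    print(f"{label}: {ratio:.1f} dB in favor of the data signal")
```

The plain difference gives 5.7 dB for Fig. 10, close to the 5.6 dB estimate quoted above, and reproduces the 1.8 dB threshold value for Fig. 11.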

Fig. 9. Signal eye pattern at the photodetector input with no interfering signal power and interfered signal power level at −11.6 dBm. The CDR circuitry can recover the clock signal.

Fig. 10. Signal eye pattern at the photodetector input, with interfering signal power level at −17.3 dBm and interfered signal power level at −11.6 dBm. The CDR circuitry can recover the clock signal.

3.E. Applicability to Generic xPON Clock and Data Recovery Circuit Modules

CDR mixed-mode ICs typically integrate the functionalities of the analog clock recovery and digital signal detection mechanisms, targeting the synchronization of a local clock signal to the incoming bit stream (self-clocking signal). The obtained clock is then used to retrieve the data from the incoming bit stream, using the already available correct timing data. Figure 13 depicts the internal structure (block diagram) of a CDR circuit, comprising a phase-locked loop (PLL) and a frequency-locked loop (FLL), which in turn allow for synchronization of the voltage-controlled oscillator (VCO) to the incoming bit stream and to the low-frequency local system clock.

The differences in the phase and frequency between the incoming data and the VCO are monitored by the high-speed phase and frequency detectors and are fed back to the VCO to correct the drift through charge pump circuitry. The recovered clock is then used to retrieve the data with a retiming latch. A more detailed discussion of the CDR operation principles can be found in [13].

Fig. 11. Signal eye pattern at the photodetector input, with interfering signal power level at −13.4 dBm and interfered signal power level at −11.6 dBm. The CDR clock is near the recovery threshold, eye pattern almost closed.

Fig. 12. Signal eye pattern at the photodetector input, with interfering signal power level at −12.9 dBm and interfered signal power level at −11.6 dBm. The CDR clock recovery is impossible, eye pattern closed.

3.E.1. Basic Linear Phase Detector Function

The phase detector block compares the difference between the local clock and the incoming data, with the basic phase detector module using either an analog mixer or a digital exclusive-OR (XOR) logic circuit, as depicted in Fig. 14. The resulting dc component at the output of the linear phase detector module therefore represents the phase difference between two input clocks (the internal one and the one retrieved from the input data stream) and can be used to speed up or slow down the VCO through a low-pass filter and charge pump circuit.
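The relation between the phase difference and the dc component at the output of an XOR-based phase detector can be checked numerically. The following sketch is a simplified model, assuming two ideal square waves with a 50% duty cycle and the same frequency; the function name and the sampling parameters are illustrative and do not describe any particular CDR device.

```python
def xor_phase_detector_dc(phase_fraction: float,
                          samples_per_period: int = 1000,
                          periods: int = 10) -> float:
    """Average XOR output of two square waves offset by a fraction of a period.

    phase_fraction = 0.25 corresponds to a 90 degree phase difference.
    """
    half = samples_per_period // 2
    shift = round(phase_fraction * samples_per_period)
    total_samples = samples_per_period * periods
    high_count = 0
    for i in range(total_samples):
        local_clock = (i % samples_per_period) < half
        recovered_clock = ((i - shift) % samples_per_period) < half
        high_count += local_clock != recovered_clock  # XOR of the two inputs
    return high_count / total_samples

for fraction in (0.0, 0.1, 0.25, 0.5):
    print(f"phase offset {fraction:.2f} period: dc level {xor_phase_detector_dc(fraction):.2f}")
```

In this model the dc level grows linearly with the phase offset up to half a period, which is what allows a low-pass filtered XOR output to steer the VCO.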

Fig. 13. Schematic diagram of CDR circuit.

Fig. 14. Basic linear phase detector function can be implemented with either (a) an analog multiplier/mixer or (b) a digital XOR logic.

Fig. 15. (a) Timing diagram of an Alexander phase detector, which takes three sequential samples of the incoming random binary data, A, T, and B, taken before, at, and after the data transitions timed by the local clock. (b) Signals generated from the combinatory logic of A, B, and T to synchronize the local clock by speeding up or slowing down the VCO.

3.E.2. Alexander Phase Detector

The detected input binary signal is converted into a useful data stream at the output of the CDR module, and digital Alexander phase detectors are typically used to implement the CDR function for lightwave circuits. The operation principle is based on the utilization of a local internal clock to sample the incoming random binary data stream by taking three data samples per bit in specific places, namely, at the middle of the bit interval before the clock transition [point A in Fig. 15(a)], at the clock transition (point T), and at the middle of the bit interval just after the clock transition (point B). Once obtained, these data samples provide feedback for the digital CDR system, indicating whether the local clock is running too slow or too fast relative to the self-clocking incoming data stream. Such a phase detector can be easily implemented using a chain of four data flip-flops as the sample-and-hold mechanism to produce A, B, and T samples from the incoming data [see Fig. 15(b) for a schematic implementation]. The data, when processed together, can be used to produce the UP–DOWN signals, which then drive the charge pump circuit and VCO to synchronize the local clock to the incoming data. This way, both the clock and the data stream incoming to the CDR module can be recovered.
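The way the A, T, and B samples can be combined into UP–DOWN decisions may be sketched as follows. The text does not spell out the combinatory logic, so the rule below follows the commonly used textbook convention and should be read as an assumption: if T still equals the earlier bit A, the clock edge arrived before the data transition (clock early); if T already equals the later bit B, the clock edge arrived after it (clock late). The polarity of the UP and DOWN labels is likewise an assumption.

```python
def alexander_decision(a: int, t: int, b: int) -> str:
    """Return 'UP', 'DOWN', or 'HOLD' from the three samples A, T, and B.

    a: sample at the middle of the bit before the clock transition
    t: sample at the clock transition
    b: sample at the middle of the bit after the clock transition
    """
    if a == b:
        return "HOLD"   # no data transition, hence no timing information
    if t == a:
        return "DOWN"   # clock edge is early relative to the data: slow down
    return "UP"         # clock edge is late relative to the data: speed up

# Example: a 0 -> 1 data transition sampled with an early and a late clock.
print(alexander_decision(0, 0, 1))
print(alexander_decision(0, 1, 1))
```

A run of identical bits produces only HOLD decisions, which is why such detectors rely on transitions in the incoming data stream.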

Despite the simplicity, temperature stability, and insensitivity to noise interference, the switching speed of the flip-flops and the combinatory logic (which has to be faster than the incoming data rate) presents a significant limitation in terms of the achievable data rates for the incoming data streams.

3.E.3. Bang–Bang Phase Detection Architecture

Many CDRs utilize multiple-phase clocks at half the data rate of the system to run the phase detection logic at a lower clock speed. As an example, let us consider a CDR whose VCO produces four clocks at half the incoming data rate to drive four parallel phase detectors at four clock transitions 90° apart. This so-called Bang–Bang phase detection architecture promises a wider clock phase margin with the half-rate clock and can be implemented using less demanding IC technologies with lower transistor performance. One implementation example was reported by Lucent Technologies, comprising four digital phase detectors driven by four 20 GHz clocks, each with a quadrature phase difference. These quadrature phases are maintained by dividing down the 40 GHz clock with an on-chip 1:2 divider. The phase detection logic then produces the up and down signals for the charge pump by examining six timing samples generated by four latch chains during two data bit intervals. The retimed data is then fed into a 1:4 demultiplexer to regenerate four 10 Gbit/s data channels, providing a combined 40 Gbit/s CDR detection capacity.

3.E.4. Commercial CDR Modules and Applicability of FDP

In terms of the FDP operation, the CDR module present in the OLT unit has to meet only two basic conditions: it has to have two signal outputs (separate outputs for the recovered clock and the retimed data), and it should not have a free-running mode capability, meaning that once the clock signal is lost, the internal clock should not be passed to the output pins of the circuit. The proposed mechanism requires information about whether a particular ONU attempted to deliver a stored data frame in the upstream channel, regardless of the condition of the data channel itself. The FDP module must therefore observe at least the data output stream and count the properly retimed data bits recovered by the CDR from the input serial data stream. As indicated previously, it is the presence of properly retrieved bits that allows the FDP module to tag the particular ONU as operating properly; the framing information is thus completely irrelevant. This means that the upstream channel signal quality may be so low that frames arrive with a bit error rate (BER) too high for their reconstruction, but in terms of the FDP operation this still means that the scheduled ONU was attempting transmission (despite DoS conditions in the uplink) and thus is operating correctly.

It is therefore our opinion that regardless of the internal CDR module operation and the implementation of the phase detectors and applied technology, the proposed FDP mechanism will operate correctly as long as the CDR does not operate in a free-running mode when no data is received at the serial input (most PON CDRs do not operate in the free-running mode), and the CDR module outputs at least the recovered and retimed data to its output ports (most PON CDR modules provide not only retimed data but also the recovered clock for internal module reference).

3.F. ONU Failure Detection Under DoS Conditions

The basic idea behind the proposed ONU failure detection mechanism is based on the aforementioned capability to detect the attempt of an active, fully operational ONU to deliver data frames upstream even under DoS conditions, when the upstream channel is flooded with the cw signal from one or more damaged lasers. For that purpose, the OLT creates a list of active ONU modules and maintains it throughout the whole network operation time, updating it when necessary (e.g., when a subscriber unit is newly registered or goes offline for any reason). When the DoS condition in the upstream channel is detected, the proposed mechanism is enabled.

First, the OLT prohibits all ONUs in the system from transmitting in the upstream channel by deregistering them in accordance with the respective standard. Next, following a shortened discovery process with no ONU feedback in the upstream channel, which is considered unreliable at this point, the previously active subscriber units are registered and prepared for scheduling. In the following step, the OLT grants upstream channel transmission slots of a predefined size and in a well-known order, which later allows the observed ONU activity to be associated with the particular MAC address of the device.

Once the scheduling stage is completed, the downstream channel is locked down and the OLT enters the reception stage, in which the internal CDR circuitry registers are observed to discover whether the scheduled ONU is trying to transmit information in the previously assigned time slot. Should such a transmission attempt occur, the CDR circuitry would be able to detect transitions indicating alternating bit patterns, as described in Subsection 3.E. The lack of such transitions indicates either an ONU power-down state, a general failure in which the electronic circuitry (e.g., laser driver and chipset) is unresponsive or operates improperly, or a location in the close vicinity of a damaged module.
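The transition-based activity check can be sketched in a few lines. The model below is deliberately simplified: the recovered bits of one slot are represented as a sequence of 0 and 1 values, a constant sequence stands in for a slot without any modulated signal, and the function names and the `min_transitions` threshold are hypothetical rather than part of any CDR register interface.

```python
from typing import Sequence

def count_transitions(bits: Sequence[int]) -> int:
    """Count 0 -> 1 and 1 -> 0 changes in the recovered bit sequence."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev != cur)

def slot_shows_activity(bits: Sequence[int], min_transitions: int = 1) -> bool:
    """Treat a slot as a transmission attempt if enough transitions were seen."""
    return count_transitions(bits) >= min_transitions

# A slot with alternating bits versus a slot with no modulation at all.
print(slot_shows_activity([0, 1, 1, 0, 1, 0, 0, 1]))
print(slot_shows_activity([0] * 8))
```

Note that the check ignores framing entirely, so a slot received with a very high BER still counts as a transmission attempt.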

To increase the detection probability, each ONU in the system may be polled several times, by allocating the upstream transmission slots either in succession or once per complete polling cycle. The final CDR internal transition register counters, associated with the particular MAC address of the subscriber units, indicate clearly which ONUs attempted delivery of data frames in the upstream channel and which failed to do so. The short duration of the ONU failure detection protocol makes it highly improbable for one of the subscriber modules to be switched off in the time interval between the detection of the DoS conditions in the upstream channel and the completion of the proposed mechanism. This allows us to assume that all the ONUs labeled inoperative at the end of the described protocol operation are indeed either broken or located in the vicinity of a damaged subscriber module, which limits the number of ONUs that need to undergo manual inspection.

3.F.1. Generic Protocol Description Applicable to All Existing xPON Systems

The following description of the FDP is applicable to any EPON/GPON/BPON system, regardless of the actual data rate in the upstream channel (in the case of GPON/BPON systems, several data rates were standardized for both upstream and downstream channels). It is assumed that the OLT has a PHY level BER meter, as well as unrestricted access to the CDR internal registers, whose contents can be stored in an array keyed by the ONU unique serial number (which can be its MAC address or the GPON Serial_Number value); this array is termed hereafter the ONU list and is used as described below. Additionally, it is assumed that the OLT can freely access any of its software agents, including the transmission scheduler and the discovery process agent, which are required for proper operation of the described protocol. In the scope of this section, a transition counter is a general-purpose counter capable of storing the number of detected CDR transitions as described in the PHY level model.

3.F.2. Downstream Transmission Channel: Detailed Description

In the downstream direction, the OLT in the proposed mechanism is responsible for the following series of actions (see Fig. 16 for details):

(1) Initialize the protocol by saving the current BER meter values. Prepare a valid ONU list, containing all properly registered and ranged ONUs operating at the moment preceding the DoS event. The list is keyed with the ONU unique value: the MAC address in the case of EPON systems and the ONU Serial_Number variable in the case of GPONs/BPONs.

(2) Prior to any specific protocol operation, all ONUs must be deregistered and must stop upstream transmission. The deregistering process is carried out in compliance with the respective standards defining the logical layer for the given type of PON network. As a result of this operation, all ONUs transition into the unregistered state and wait for the registration procedure.

(3) The OLT initiates a shortened discovery process, where the ONUs will be registered in the network structure while their feedback (e.g., as defined for the discovery process in 802.3, clause 64) is ignored, since the upstream channel must be considered unreliable at this particular stage. Since the time interval between the occurrence of the DoS conditions and the start of the fault detection protocol is relatively short (on the order of 1 s), the round trip time (RTT) values are assumed to remain within the tolerance range required for proper operation of the scheduling mechanisms in the given PON instance. For the same reason, it is also assumed that no ONUs will go offline during the aforementioned time interval; thus all subscriber units active and operational before the DoS occurrence are assumed to be unconditionally operational after the event occurs, with the exception of an unknown number of modules that suffered from malfunction.

(4) The OLT transmission scheduling agent selects the start of the upstream transmission slot for each subscriber unit from the ONU list. While scheduling upstream transmission slots, care must be taken to maintain the minimum time interval of RTTmax (typically 200 μs in EPON systems, which corresponds to the OLT-ONU distance of 20 km; in the case of GPON systems supporting distances of up to 60 km, this value must be increased appropriately to 600 μs) between the first scheduled upstream transmission slot and the OLT scheduler local time. This measure prevents reorganization of the ONU list order when receiving upstream slots, as described in Subsection 3.F.3.

(5) To maximize the ONU transmission detection probability, each upstream transmission slot sequence shall be repeated a number of times (n), defined by the network administrator during system setup. Under favorable conditions, a single upstream transmission slot sequence (containing one slot per ONU) is sufficient to detect active and damaged modules; repeating the slot scheduling process n times produces a series of n consecutive and continuous upstream transmission slot sequences and increases the detection probability. It must be noted here that all allocated upstream transmission slots are of a fixed type and have a length of, e.g., 1 ms each (10^6 bits, 62,500 TQ as defined in 802.3, clause 64), and thus fit perfectly into a single GATE Multipoint Control Protocol Data Unit (MPCP DU) grant slot for EPON systems. In GPON systems, numerous individual windows might be granted sequentially to a single ONU, allowing for longer transmission periods (a data frame is always 125 μs long for both 1.24416 Gbit/s and 2.48832 Gbit/s systems), and thus several frames can be assigned per cycle. The exact number of individual frames per ONU per cycle is left to the network administrator. A transmission period of 1 ms is argued to be sufficient for detection of any transmission attempts from a functioning ONU.

Fig. 16. Flow chart for OLT side operation of the proposed FDP: downstream transmissions only.

Once a complete list of upstream transmission slot allocations is ready (containing at least one full transmission sequence for all ONUs included in the valid ONU list), the sequence of downstream slot allocation messages (GATE MPCP DUs in the case of EPON systems, bandwidth allocation fields BWmap in the case of GPON systems, see ITU-T G.984.3, Section 8.1.3.6) is delivered to the appropriate ONUs. Once properly received at the ONUs, the slot allocation messages are parsed and processed as in a normal operation mode. When all the slot allocation messages are safely delivered to the corresponding ONUs, the described protocol is considered to have completed its operation in the downstream direction.
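The slot scheduling performed in the downstream stage can be sketched as follows. This is a minimal illustration under stated assumptions, not an OLT scheduler implementation: the class and function names are hypothetical, the propagation speed in fiber is taken as 2 × 10^8 m/s (which reproduces the 200 μs and 600 μs RTTmax values quoted for 20 km and 60 km), and all slots use the 1 ms example length mentioned in step (5).

```python
from dataclasses import dataclass
from typing import List

TQ_NS = 16                     # EPON time quantum in nanoseconds (802.3, clause 64)
SLOT_US = 1000                 # 1 ms slot, the example length used in the text
PROPAGATION_M_PER_S = 2e8      # assumed propagation speed in the fiber

def rtt_max_us(distance_km: float) -> float:
    """Round trip time in microseconds for the given OLT-ONU distance."""
    return 2 * distance_km * 1e3 / PROPAGATION_M_PER_S * 1e6

@dataclass
class Slot:
    onu_id: str        # MAC address or Serial_Number of the scheduled ONU
    start_us: float    # slot start relative to the OLT scheduler clock
    length_us: float

def build_schedule(onu_list: List[str], now_us: float,
                   distance_km: float, n: int) -> List[Slot]:
    """Repeat one slot per ONU, in list order, n times after an RTTmax guard."""
    start = now_us + rtt_max_us(distance_km)
    slots: List[Slot] = []
    for _ in range(n):
        for onu_id in onu_list:
            slots.append(Slot(onu_id, start, SLOT_US))
            start += SLOT_US
    return slots

schedule = build_schedule(["onu-a", "onu-b"], now_us=0.0, distance_km=20.0, n=2)
for slot in schedule:
    print(slot)
```

Keeping the slots contiguous and in the fixed list order is what later allows the OLT to attribute each received slot to one ONU identifier without relying on any upstream framing.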

3.F.3. Upstream Transmission Mechanism: Detailed Description

In the upstream direction, the OLT in the proposed mechanism is responsible for the following series of actions (see Fig. 17 for details):

(1) The OLT waits for the start of the first scheduled upstream transmission slot. The OLT obtains the scheduled ONU unique identifier (MAC address in the case of EPON systems and Serial_Number in the case of GPON/BPON systems) from the valid ONU list, resets the BER meter, and starts receiving data. Provided that the scheduled ONU does transmit data frames in the allocated slot (typically, subscriber packets pending transmission), the CDR circuitry in the OLT will try to lock to the received signal, as discussed in detail in Subsection 3.C, producing a distorted bit pattern (the degree of distortion depends on the actual cross-talk situation and the power level relation between the interfered and interfering signals). As indicated previously, it is sufficient to observe the CDR transition counter, because we are not interested in receiving fully formed data frames but rather in retrieving the data clock from the upstream transmission. This, as shown in Subsection 3.C, can be assured under any conditions with out-of-band interference. If the interfered and the interfering data channels have overlapping wavelengths, at least clock data recovery can still be performed, even if individual packets cannot be retrieved from the upstream channel. This can be directly interpreted as an active ONU attempting transmission on top of the DoS signal originating from a different ONU. At the end of the transmission slot time, the current value of the CDR transition counter is stored in a separate array, keyed with the unique identifier of the currently transmitting ONU.

(2) The valid ONU list is checked and, if more entries are found, the process moves back to step (1); otherwise, the valid ONU list pointer is reset and the next step is taken.

(3) This operation step is similar to the one described in step (1), with only one difference: since from now on consecutive upstream transmission slots will be received from the given ONU unique identifier, the previously initiated transmission attempt counter is not reset at the start of the scheduled slot, but is instead loaded from the respective array field, linked with the given ONU unique identifier. The rest of the operations remain consistent with the data flow described in step (1).

Fig. 17. Flow chart for OLT side operation of the proposed FDP: upstream transmissions only.

(4) The valid ONU list is checked again and, if more entries are found, the process moves back to step (3); otherwise, the numbers of upstream slots received and scheduled are compared. If all scheduled upstream transmission slots were successfully received, the process is terminated and the next (final) step is taken; otherwise, the valid ONU list pointer is reset, and the process returns to step (3).

In this last step, the statistics for the FDP are produced. Any clearly active ONU shall have a nonzero number of transmission attempts, since it tries to deliver data in the corresponding slot. Any inactive, disconnected, or faulty ONU shall have a transmission attempt count of zero.
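The reception loop and the final statistics can be summarized in a short sketch. It assumes that the per-slot CDR transition count is available through some register read; the `read_transitions` callback below is a hypothetical stand-in for that access, and all names are illustrative. Accumulating into one counter per ONU is equivalent to resetting the counter in the first pass and reloading it from the array in later passes.

```python
from typing import Callable, Dict, List, Tuple

def run_reception(onu_list: List[str], n: int,
                  read_transitions: Callable[[str, int], int]) -> Dict[str, int]:
    """Accumulate CDR transition counts per ONU over n slot sequences."""
    counters = {onu_id: 0 for onu_id in onu_list}
    for cycle in range(n):
        for onu_id in onu_list:   # slots arrive in the well-known list order
            counters[onu_id] += read_transitions(onu_id, cycle)
    return counters

def classify(counters: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split ONUs into active (nonzero count) and suspect (zero count)."""
    active = [onu_id for onu_id, count in counters.items() if count > 0]
    suspect = [onu_id for onu_id, count in counters.items() if count == 0]
    return active, suspect

def fake_cdr(onu_id: str, cycle: int) -> int:
    """Hypothetical register read: 'onu-b' never produces any transitions."""
    return 0 if onu_id == "onu-b" else 120

counters = run_reception(["onu-a", "onu-b", "onu-c"], n=3, read_transitions=fake_cdr)
active, suspect = classify(counters)
print("active:", active)
print("suspect:", suspect)
```

Only the ONUs in the suspect list would then need manual inspection.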

It is also recognized that, under highly unfavorable transmission conditions, an active ONU can be identified as an inactive one. However, it must be stressed that, depending on the quality of the OLT photodetector and the CDR circuitry, such a situation can take place only if the following conditions are met simultaneously:

• The interfering and the target ONU central wavelength spacing [the wavelength difference term in formula (2)] must lie within the receiver bandwidth, thus producing in-band cross talk; otherwise, out-of-band cross talk will occur, resulting in extinction ratio degradation rather than signal beating (as mentioned in Subsection 3.C).

• If the aforementioned condition is met, the target-to-interfering ONU signal power ratio is below 3 dB; in other words, the interfering signal power has to be at least 3 dB lower than the target ONU signal power to assure proper operation of the proposed mechanism. Provided that strictly in-band cross talk occurs, the said 3 dB signal power ratio can be breached very easily by transmitting a high power signal (with the launch power at roughly −1 to +1 dBm for IEEE 802.3ah compliant equipment). All the ONUs in real deployments feature similar link lengths; thus the channel loss figures are approximately the same for all deployed devices. Slightly differing drop section lengths do not provide sufficient differentiation of the power budget to warrant the aforementioned 3 dB power margin. Only buslike topologies can provide better protection against in-band cross talk. However, they are not considered a commonly deployed topology, as opposed to the standard tree-and-branch network architecture generally identified with PON systems.
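The two conditions listed above can be combined into a simple check. The sketch below only restates them in code form; the function name is hypothetical, the `in_band` flag is assumed to be known from elsewhere, and the default 3 dB margin is the figure given in the text (the example power levels are those of the in-band measurements in Subsection 3.D).

```python
def may_be_misidentified(in_band: bool, target_dbm: float,
                         interferer_dbm: float, margin_db: float = 3.0) -> bool:
    """Return True if an active ONU could be labeled inactive.

    Both conditions must hold: the cross talk is in-band, and the target
    signal exceeds the interfering signal by less than margin_db.
    """
    return in_band and (target_dbm - interferer_dbm) < margin_db

print(may_be_misidentified(True, -11.6, -12.9))    # in-band, about 1.3 dB margin
print(may_be_misidentified(True, -11.6, -17.3))    # in-band, about 5.7 dB margin
print(may_be_misidentified(False, -11.6, -12.9))   # out-of-band
```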

Short reach ONUs (e.g., IEEE 802.3ah PX10 systems) are typically equipped with Fabry–Perot (FP)-type laser diodes, whose central wavelengths are highly unstable (depending on external temperature conditions, for example); it is therefore very unlikely that at any particular time any two ONUs will have exactly overlapping transmission spectra. In the case of longer reach, higher-data-rate systems (e.g., International Telecommunication Union GPON compliant equipment), uncooled distributed feedback (DFB) lasers are used, which have superior wavelength stability when compared with FP lasers. Nevertheless, the temperature wander effects for ONUs placed in different locations and subject to different environmental conditions will inherently result in differing central wavelength drifts. Under standard operation conditions (no upstream channel feedback signals are accessible at the ONU level), it is technically infeasible to monitor and track the wavelength of any other ONU. As a consequence, it is virtually impossible to carry out a targeted in-band cross talk attack by fine-tuning the laser source at the malicious ONU to cause beating interference at the OLT receiver. Even though it is argued that in-band cross talk is highly unlikely to occur in practical deployments, this topic will be examined in more detail in future work, to fully assess the technical feasibility of the proposed mechanism.

4. Conclusions

Based on the conducted lab experiments, we may conclude that the main system operation assumptions are correct and that the OLT CDR circuitry will be able to recover the clock signal in the presence of the interfering signal emitted by a faulty ONU as part of a DoS event, even under in-band interference (though with the limitations discussed before). Since no complete OLT CDR block was available for testing, we were not able to assess the operation of actual EPON/GPON compliant components under the above conditions. Further testing with standard EPON/GPON components is therefore required to prove the applicability of the described protocol in practice. Additionally, the protocol layer mechanisms described in detail in Subsection 3.F must be implemented on the OLT side to allow for a correct interpretation of the obtained error measurement results. It must be emphasized that the burst mode receiver module used in the OLT blade is typically characterized by lower sensitivity in terms of signal and clock recovery capabilities (due to burst mode operation and clock recovery performed on frame preambles). Therefore the results obtained with real EPON compliant hardware are expected to deteriorate, and the exact proportions between interfering and interfered signal power levels need to be assessed experimentally.

The proposed fault detection protocol is, to the best of our knowledge, the first complete solution allowing for remote ONU identification and determination of faulty/suspicious ONU modules, even with the lack of upstream channel connectivity (cw transmission originating from at least one faulty subscriber device). As such, the disclosed fault discovery protocol is an essential tool for faulty ONU location, relieving the network administrator from the burden of manually testing all ONUs connected to the given PON branch by sending qualified technicians to subscriber sites. This in turn minimizes the network downtime resulting from the need to visit all ONU locations in the system and test all the equipment at the customers' sites. In this way, the QoS and SLA are less prone to be violated due to DoS events (deliberate or accidental). The examined protocol is simple, robust, and completely based on the existing OAM/bandwidth allocation control messages, which were already standardized for all existing xPON systems. As such, the system takes advantage of already existing components, requiring a minimum number of modifications at the PHY level (granting access to CDR registers) and only firmware level upgrades, resulting in a modified interpretation of standard MPCP DU frames as well as the novel data flow required for proper operation of the FDP mechanism. The proposed solution does not require any changes in the respective PON standards. Both IEEE 802.3ah and G.984 maintain their complete functionalities, and the proposed FDP mechanism extends the application of basic protocol mechanisms, resulting in an add-on mechanism. As such, the proposed mechanism might eventually be incorporated into the respective EPON/GPON standards as an option (in an amendment), since no standard upgrade is required for its implementation.

Acknowledgments

The authors acknowledge financial support from Fundação para a Ciência e a Tecnologia, Portugal, through the grant SFRH/BDE/15524/2004 and from Nokia Siemens Networks S.A., Portugal.

References

1. G. Kramer, “10 Gbps PHY for EPON: Call for Interest,” IEEE 802.3, CFI, 07-03-2006 (IEEE, 2006).
2. T.-H. Wu, “Emerging technologies for fiber network survivability,” IEEE Commun. Mag. 33, 58–74 (1995).
3. O. Gerstel and R. Ramaswami, “Optical layer survivability: a post-bubble perspective,” IEEE Commun. Mag. 41, 51–53 (2003).
4. IEEE 802.3, “Call For Interest: 10 Gbps PHY for EPON,” available at http://www.ieee802.org/3/cfi/0306_1/cfi_0306_1.pdf (2006).
5. C.-J. Chae, E. Wong, and R. S. Tucker, “Optical CSMA/CD media access scheme for Ethernet over passive optical network,” IEEE Photon. Technol. Lett. 14, 711–713 (2002).
6. G. Kramer, B. Mukherjee, and G. Pesavento, “IPACT: a dynamic protocol for an Ethernet PON (EPON),” IEEE Commun. Mag. 40, 74–80 (2002).
7. G. Kramer, B. Mukherjee, and G. Pesavento, “Interleaved polling with adaptive cycle time (IPACT): a dynamic bandwidth distribution scheme in an optical access network,” Photonic Network Commun. 4, 89–107 (2002).
8. G. Kramer, A. Banerjee, N. K. Singhal, B. Mukherjee, S. Dixit, and Y. Ye, “Fair queueing with service envelopes (FQSE): a cousin-fair hierarchical scheduler for subscriber access networks,” IEEE J. Sel. Areas Commun. 22, 1497–1513 (2004).
9. M. Ma, Y. Zhu, and T. H. Cheng, “A bandwidth guaranteed polling MAC protocol for Ethernet passive optical networks,” presented at the IEEE INFOCOM 2003, San Francisco, Calif. (IEEE, 2003).
10. Y. Horiuchi and N. Edagawa, “ONU authentication technique using loopback modulation within a PON disturbance environment,” presented at the OFC/NFOEC Optical Fiber Communication Conference (Optical Society of America, 2005).
11. Y. Horiuchi, K. Ohara, H. Tanaka, and M. Suzuki, “ONU diagnostic methodology of passive optical network in disturbance environment for physical layer security,” presented at the ECOC (AEI, 2003).
12. I. T. Monroy, E. Tangdiongga, and H. de Waardt, “On the distribution and performance implications of filtered interferometric crosstalk in optical WDM networks,” J. Lightwave Technol. 17, 989–997 (1999).
13. A. Buchwald and K. Martin, Integrated Fiber-Optic Receivers (Kluwer Academic, 1995).