
Universitat Politècnica de Catalunya Optical Communications Group

Recovery mechanisms in ASON/GMPLS networks

Luis Velasco

Advisor: Dr. Salvatore Spadaro

Co-advisor: Dr. Jaume Comellas

A thesis presented in fulfillment of the requirements for the degree of

Doctor of Philosophy

May 29, 2009

© 2009 by Luis Velasco. All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the author.

ISBN: 978-84-692-0809-0
Registration number: 09/25118

Optical Communications Group (GCO)
Universitat Politècnica de Catalunya (UPC)
C/ Jordi Girona, 1-3, Campus Nord, D6-107
08034 Barcelona, Spain

DOCTORAL THESIS GRADING CERTIFICATE

The examination committee, composed of the undersigned, has met to assess the doctoral thesis:

Thesis title: Recovery mechanisms in ASON/GMPLS networks
Thesis author: Luis Velasco

It agrees to award the grade of:

Fail / Pass / Good / Excellent / Excellent Cum Laude

Barcelona, ............... of ............... , ...............

The President: ............... (full name)
The Secretary: ............... (full name)
Member: ............... (full name)
Member: ............... (full name)
Member: ............... (full name)

A problem cannot be solved by thinking in the same way as when it was created.

Albert Einstein

Since they did not know it was impossible, they did it.

Anonymous

Acknowledgements

This doctoral thesis would not have been possible without the help and support of the people who, over the last few years, have believed in me.

When I landed at the UPC years ago, I brought extensive professional experience but lacked experience as a researcher. It is not easy to bet on a person of a certain age who arrives at the university intending to pursue a doctorate, and yet Gabriel Junyent gave me an opportunity; since then I have worked in the GCO, and whenever I have needed his support he has given it to me.

During these years, two people have given me their help and devoted their time to me, showing immense patience; I owe this thesis to them. Thank you, Salvatore and Jaume, the co-advisors of this thesis.

I am also immensely grateful to the UPC, where I carry out my teaching and research activity. I feel that this is my home.

Finally, although for me they come first, I want to dedicate this thesis to the people I love most: my parents Domingo and Faustina and my sister Elena. They have always believed in me.

Summary

This thesis is devoted to the study and physical implementation of different recovery mechanisms in GMPLS-controlled ASON networks. When the legacy SONET/SDH technology was standardized, several protection schemes were defined that provide recovery times within 50 ms. However, with the advent of optical networks providing automatic switching, the 50 ms figure has been questioned; in our opinion, this is because the figure is so difficult to reach in transparent optical networks with commercially available technology.

This thesis pursues three main objectives. First, we rely on the GMPLS control plane to bring protection schemes to the optical layer and so provide shorter recovery times; accordingly, we adopt the 50 ms figure as the recovery time objective. Moreover, flexibility is important for network operators: network planning should be able to choose among several protection schemes and implement the one most appropriate to a particular network. In this regard, the second objective of this thesis is to provide several protection mechanisms covering different requirements. Nevertheless, choosing a protection scheme at the planning phase may not be enough; a mechanism that chooses the protection scheme in real time, as a function of the current state of the network, is desirable. Providing one is our third objective.

The recovery mechanisms were physically implemented by building optical nodes, and the functional and physical designs of the nodes are described. Stringent performance requirements imposed notable optimization of their software and hardware architecture. Experimental results proved the efficiency of the nodes. Moreover, several advances have been made within the framework of the ASON/GMPLS CARISMA network test-bed. The new architecture of different modules at the three planes of the test-bed is described, including some of the algorithms and mechanisms implemented. Experimental tests are carried out to measure its performance. First, a proposal to implement shared-path protection on rings is presented in detail. Then, an algorithm to compute disjoint routes under the wavelength continuity constraint is presented, and its performance is experimentally compared with a generally accepted RWA algorithm. Moreover, the fundamental configuration time is experimentally obtained. Finally, two strategies for fault localization are presented and experimentally compared.

Two different recovery schemes are proposed and implemented. Availability is computed for every protection scheme, and algebraic equations are derived. At the link layer, two complete solutions to build ring-based dynamic optical networks with link protection capabilities are proposed and evaluated. They consist of: 1) a novel GMPLS-based mechanism which coordinates the protection actions after failures, and 2) a new node design to support link protection. At the path layer, the shared-path protection with extra-traffic scheme is implemented in ASON rings provided with a GMPLS control plane. The protection time provided by this scheme is analyzed as a function of the physical optical components. We demonstrate that the switching time of the components prevents the complete set of affected connections from being protected within 50 ms after fault detection. This limitation can be overcome by traffic differentiation. Finally, a mechanism to provide protection times under 50 ms to the complete set of connections is presented. It chooses the layer (link or path) in which the protection is performed as a function of the number of paths to protect.


Resumen (English translation)

This thesis is devoted to the study and physical implementation of different failure recovery mechanisms in ASON/GMPLS networks. During the standardization of the SONET/SDH technology, several protection schemes were defined that provided recovery times below 50 ms. However, with the arrival of automatically switched optical networks, the 50 ms figure has been questioned, in our opinion because it is difficult to achieve in transparent optical networks with commercially available technology.

This thesis pursues three main objectives. First, we rely on the GMPLS control plane to bring protection schemes to the optical layer. We therefore adopt the 50 ms figure as the recovery time objective. Moreover, flexibility is an important factor for operators, so it is important to offer them different protection schemes so that the network planning phase can choose the one best suited to each particular case. In this regard, the second objective of this thesis is to provide different protection schemes covering different requirements. However, choosing the protection scheme in advance may not be enough; a mechanism that makes this decision in real time, as a function of the state of the network at that moment, is desirable. This is our third objective.

For the physical implementation of the recovery mechanisms, real optical nodes were built, and the physical and functional designs of these nodes are therefore described. The performance requirements were very stringent and imposed a notable optimization of their software and hardware architecture. The experimental results obtained prove the efficiency of the nodes. In addition, several advances have been made within the framework of the ASON/GMPLS CARISMA network test-bed. The new architecture of different modules in the three planes of the network is described, including several implemented algorithms and mechanisms. A set of experimental tests was carried out to measure its performance. First, a proposal to implement shared protection at the path level is presented. Then, an algorithm to compute disjoint routes under the wavelength continuity constraint is presented. It has been compared experimentally with another generally accepted RWA algorithm. Next, the configuration time of the optical nodes has been obtained. Finally, two fault localization strategies are presented and compared experimentally.

In this thesis, two different recovery schemes are proposed and implemented. The availability of each scheme has been computed and algebraic equations have been derived. First, two complete solutions for building dynamic ring-based optical networks with link protection capabilities are proposed and evaluated. Both solutions consist of: 1) a new mechanism in the GMPLS control plane that coordinates the protection actions after a failure, and 2) a new optical node design that supports link protection. Second, the shared protection scheme with extra traffic is implemented at the optical path layer on ring ASON networks equipped with a GMPLS control plane. The protection time provided by this scheme is analyzed as a function of the physical optical components, and it is shown that the switching time of these components prevents the complete set of affected connections from being protected within 50 ms of the detection of a failure. This limitation can be overcome by differentiating traffic into two classes. Finally, a mechanism is presented that provides protection times below 50 ms to the complete set of protected connections, selecting the layer (link or path) in which protection is performed as a function of the number of connections to protect.


Table of Contents

Chapter 1 Introduction
  1.1 Motivation and objectives
  1.2 Thesis Outline

Chapter 2 Intelligent Optical Transport Networks
  2.1 Optical Transmission Technology
  2.2 Network topology
  2.3 Network model
  2.4 The ASON/GMPLS paradigm
    2.4.1 Automatically Switched Optical Network
    2.4.2 Generalized Multiprotocol Label Switching
    2.4.3 Optical Supervisory Channel
  2.5 Summary

Chapter 3 Recovery mechanisms at the optical layer
  3.1 Availability and Service Level Agreements
  3.2 Recovery Schemes
  3.3 OMS Protection Mechanisms
    3.3.1 OMS Dedicated Protection Ring (OMS DPRing)
    3.3.2 OMS Shared Protection Ring (OMS SPRing)
    3.3.3 OMS protection in mesh networks
  3.4 Path Protection Mechanisms
  3.5 State-of-the-art
  3.6 Summary

Chapter 4 Availability Studies
  4.1 OMS protection schemes in rings
  4.2 OMS protection schemes in mesh networks
  4.3 Path protection
  4.4 Summary

Chapter 5 The CARISMA Network Test-Bed
  5.1 The Transport Plane
    5.1.1 sROADM design
    5.1.2 OXC emulator
    5.1.3 Management interfaces
  5.2 The Control Plane
  5.3 The Request Generator
  5.4 Summary

Chapter 6 Shared-Path Protection
  6.1 Shared-path Protection (SPP) in rings
  6.2 Routing and Wavelength Assignment (RWA)
  6.3 SPP with Extra-traffic in ASON/GMPLS Rings
    6.3.1 SPP with extra-traffic Implementation
    6.3.2 Performance Evaluation
  6.4 Summary

Chapter 7 ROADM Design and Protection Time Model for SPP
  7.1 ROADM Design
  7.2 Fault localization
  7.3 Protection Time Model
  7.4 Performance Evaluation
  7.5 Summary

Chapter 8 OMS Protection in ring-based networks
  8.1 GMPLS-controlled OMS protection
    8.1.1 The GAPS mechanism
    8.1.2 GAPS LMP extensions definition
    8.1.3 Protection time Models for GAPS-controlled OMS protected rings
  8.2 ROADM Design to Support OMS Protection Schemes
  8.3 Node implementation and evaluation
  8.4 OMS Experimental results
  8.5 Summary

Chapter 9 OMS – SPP real-time mechanism
  9.1 Shared-Path Protection and OMS Shared Protection
  9.2 Summary

Chapter 10 Closing Discussion
  10.1 Main Contributions
  10.2 Publications
    10.2.1 Journals
    10.2.2 Conferences and workshops
  10.3 National and European Research Projects
  10.4 Topics for Further Research
    10.4.1 Dynamically Managed Differentiated Services
    10.4.2 MPLS-GMPLS interconnection

List of Acronyms

References


List of Figures

Fig. 2-1 WDM Technology
Fig. 2-2 Optical nodes and topologies
Fig. 2-3 Example of a layered optical network architecture
Fig. 2-4 The ASON architecture
Fig. 2-5 OCC functional blocks
Fig. 2-6 Optical Supervisory Channel (adapted from [Gr04])

Fig. 3-1 Protection and restoration schemes (adapted from [ZhMu04])
Fig. 3-2 An OMS DPRing transporting one LSP, a) before and b) after a failure in the link B-C (adapted from [Ar00])
Fig. 3-3 An OMS SPRing transporting two LSPs, a) before and b) after a failure in the link B-C (adapted from [Ar00])
Fig. 3-4 LSPs transported with the OMS SPRing scheme
Fig. 3-5 a) A p-cycle network, b) the same network after a failure in an on-cycle link, and c) in a straddling link.
Fig. 3-6 Examples of OMS Dp-cycles and OMS Sp-cycles networks.
Fig. 3-7 End-to-end vs. segment protection
Fig. 3-8 Shared-path protection

Fig. 4-1 LSP unavailability in OMS schemes in metropolitan rings
Fig. 4-2 LSP unavailability in OMS schemes in long-haul rings
Fig. 4-3 Double link failure in a OMS SPRing
Fig. 4-4 Increasing LSPs unavailability by adding straddling links.
Fig. 4-5 Comparing LSP unavailability in OMS Dp-cycles and in OMS DPRing networks.
Fig. 4-6 Unavailability of LSPs in OMS p-cycles networks.
Fig. 4-7 LSP unavailability.

Fig. 5-1 The CARISMA network test-bed
Fig. 5-2 sROADM functional design
Fig. 5-3 Unidirectional ring with three sROADMs
Fig. 5-4 Traffic patterns
Fig. 5-5 2x2.5 Gbit/s Transponder card
Fig. 5-6 Different uses of the transponder card
Fig. 5-7 sROADM physical layout and building blocks
Fig. 5-8 Physical layout of OSNL, Transponder and Master cards and test-bed.
Fig. 5-9 The Link Resource Manager
Fig. 5-10 The OXC Model: TE-links, data-links, and CPs
Fig. 5-11 Unprotected SNC model
Fig. 5-12 Protected SNC model
Fig. 5-13 The Routing Controller
Fig. 5-14 The Connection Controller Architecture

Fig. 6-1 Dedicated protection and shared-path protection in optical rings.
Fig. 6-2 Trap Topology
Fig. 6-3 Shortest Disjoint Path Pair algorithm
Fig. 6-4 Breaking down the network graph
Fig. 6-5 Example of a network represented by three graphs.
Fig. 6-6 Performance of PC-RWA vs. FF.
Fig. 6-7 Blocking probability against SP traffic load.

Fig. 7-1 OADM design to support SPP with extra-traffic.
Fig. 7-2 a) Additional hardware needed for the Optical pilot tone and b) for LMP-based fault localization.
Fig. 7-3 LMP Failure localization
Fig. 7-4 SPP with extra-traffic time model.
Fig. 7-5 Protection times against switching time
Fig. 7-6 Protection times against the number of LSPs to protect

Fig. 8-1 Actions performed by the switching nodes.
Fig. 8-2 GAPS-controlled OMS DPRing under normal conditions.
Fig. 8-3 OMS DPRing after a LoL detection.
Fig. 8-4 Failures management: GAPS messages
Fig. 8-5 GAPS mechanism: Finite State Machine
Fig. 8-6 A failure in a bidirectional link is detected by its adjacent nodes.
Fig. 8-7 Protection time for OMS DPRings
Fig. 8-8 Protection time for OMS SPRings
Fig. 8-9 Optical Nodes design to support OMS DPRing scheme.
Fig. 8-10 Optical Node design to support OMS SPRing scheme
Fig. 8-11 Optical Node internal architecture
Fig. 8-12 Node Agent architecture
Fig. 8-13 Node time experiment
Fig. 8-14 Experimental 2*t_node + t_switch time
Fig. 8-15 Experimental 2*t_config + t_switch value
Fig. 8-16 Evolution of protection time with the number of nodes
Fig. 8-17 OMS DPRing Protection time for rings with 18 nodes
Fig. 8-18 OMS SPRing Protection time for rings with 18 nodes

Fig. 9-1 OADM design supporting both SPP and OMS SPRing.
Fig. 9-2 All distinct LSPs routed through link 3-4.
Fig. 9-3 Spare capacity used (dl) against the number of protected connections (r). When r≤11 SPP is applied, or else OMS SPRing is applied.
Fig. 9-4 Spare capacity used against the number of protected connections

Fig. 10-1 Example of two MPLS islands connected through one ASON/GMPLS domain.


List of Tables

Table 3-1 MTTF and MTTR typical values
Table 3-2 Protection Schemes
Table 3-3 Comparison of protection architectures (adapted from [Ar00])
Table 4-1 Set of links definition
Table 4-2 Protection schemes availability comparison
Table 5-1 Example of XML command
Table 5-2 Classes of traffic Definition
Table 6-1 PC-RWA Algorithm
Table 7-1 Localization times
Table 7-2 Experimental times
Table 7-3 Classes of Protection (CoP)
Table 8-1 Information Transported by GAPS Messages
Table 8-2 GAPS Object Format
Table 8-3 Comparison of OMS solutions


Chapter 1

Introduction

1.1 Motivation and objectives

Cable cuts and equipment failures can cause communication service blackouts lasting hours, days, or even weeks. The increasing use of telecommunications networks for business transactions makes recovery more important than ever before.

The impact of a failure has dramatically increased over the last decades as telecommunications networks have evolved from low capacity PDH point-to-point systems, transmitting at 565 Mbit/s, to today’s high capacity optical networks, transmitting more than 1 Tbit/s per fiber.

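The authors in [ToNe94] estimate 2.72 cuts per year per 1000 km of cable. Nationwide networks comprise many thousands of installed route kilometers, so cuts are frequent: at this rate, a network exceeds one cut per day on average once its installed route length passes roughly 134,000 km.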

Cable cuts are mainly caused by cable dig-ups [Cr93], which suggests that cable cuts may be more frequent in metropolitan areas, where digging activities are generally carried out. In fact, other studies estimate that metropolitan networks experience 8 cuts per year for every 1000 km of fiber, whereas long-haul networks experience 1.8 cuts per year for every 1000 km of fiber.
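The arithmetic behind these cut rates can be checked with a short calculation. The following is a minimal sketch, assuming the rates quoted above are annual figures per 1000 km and that cuts scale linearly with installed length; the function names and the 10,000 km example network are illustrative assumptions, not part of the cited studies.

```python
# Sketch: expected cable cuts from a per-1000-km annual cut rate.
# Function names and the example network length are hypothetical.

def expected_cuts_per_year(cuts_per_1000km_year: float, route_km: float) -> float:
    """Expected number of cuts per year for a network of the given length."""
    return cuts_per_1000km_year * route_km / 1000.0


def route_km_for_daily_cut(cuts_per_1000km_year: float) -> float:
    """Installed length at which the network averages one cut per day."""
    return 365.0 * 1000.0 / cuts_per_1000km_year


if __name__ == "__main__":
    rates = {"[ToNe94]": 2.72, "metropolitan": 8.0, "long-haul": 1.8}
    example_km = 10_000.0  # hypothetical network size
    for label, rate in rates.items():
        per_year = expected_cuts_per_year(rate, example_km)
        break_even = route_km_for_daily_cut(rate)
        print(f"{label}: {per_year:.1f} cuts/year over {example_km:.0f} km; "
              f"one cut/day at about {break_even:.0f} km")
```

Under these assumptions, the metropolitan rate reaches one cut per day at a much shorter installed length than the long-haul rate, which is consistent with the observation that dig-ups concentrate in metropolitan areas.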

With the standardization of the SONET/SDH technology in the 1990s, the main protection schemes were defined, providing recovery times within the mythical figure of 50 ms. However, with the advent of optical networks providing automatic switching, the 50 ms figure has been questioned; in our opinion, this is because the figure is so difficult to reach in transparent optical networks with commercially available technology. Note that the advantage of the SONET/SDH technology over the optical technology is that signals are electronically processed node by node, whereas in transparent optical networks signals are sent end to end without intermediate processing. Nonetheless, the ASON/GMPLS technology, by providing a control plane, addresses that issue.

In this thesis we rely on the GMPLS control plane to bring protection schemes to the optical layer and so provide shorter recovery times. Accordingly, we adopt the 50 ms figure as the recovery time objective.

Moreover, flexibility is an important issue for network operators. It is therefore important to provide different protection mechanisms, so that network planning can choose among several protection schemes and implement the one most appropriate to a particular network. In this regard, another objective of this thesis is to provide several protection mechanisms covering different requirements.

Nevertheless, choosing a protection scheme at the planning phase may not be enough; a mechanism that chooses the protection scheme in real time, as a function of the current state of the network, is desirable. Providing one is our third objective.

1.2 Thesis Outline

The remainder of this thesis walks through the relevant background material, followed by availability studies of some important protection schemes. The central chapters are devoted to the implementation and experimental performance evaluation of several GMPLS-controlled recovery mechanisms. Finally, a concluding discussion of the key advances in the field of network recovery, the list of publications, and some lines of future research are provided.

Chapter 2 is a brief introduction to intelligent optical transport networks. Optical transmission technology, network topologies, and the concept of transparency are described first. Then, the ASON/GMPLS paradigm and a generally accepted network model are presented.

Chapter 3 is a survey of recovery mechanisms. First, two concepts related to recovery are presented: availability and service level agreements. Then, the terms most commonly used in recovery are introduced and a general classification of recovery mechanisms is presented. Moreover, different mechanisms for ring and mesh topologies are introduced. Finally, a review of the state of the art in this field is presented.

Chapter 4 is devoted to availability studies. Availability for every protection scheme presented in Chapter 3 is computed, and algebraic equations are derived. This chapter is based on already published material ([COMPNW08, ONDM08-1, JON09]).
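As background for the availability computations, the standard textbook relations between MTTF, MTTR, and availability can be sketched as follows. This is a minimal illustration, assuming independent failures; it is not the set of equations derived in Chapter 4, and the function names and numeric values are hypothetical rather than taken from Table 3-1.

```python
# Sketch of standard availability relations (assumes independent failures).
# Names and numbers are illustrative, not values from the thesis.
from math import prod
from typing import Iterable


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one element: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series(avails: Iterable[float]) -> float:
    """A path crossing several elements is up only if all of them are up."""
    return prod(avails)


def parallel(avails: Iterable[float]) -> float:
    """Disjoint alternatives: the service is down only if all are down."""
    return 1.0 - prod(1.0 - a for a in avails)


if __name__ == "__main__":
    link = availability(mttf_hours=5000.0, mttr_hours=12.0)  # hypothetical link
    working = series([link] * 3)      # working path over three links
    backup = series([link] * 5)       # longer, link-disjoint backup path
    protected = parallel([working, backup])
    print(f"link={link:.6f} working={working:.6f} protected={protected:.8f}")
```

The series and parallel forms show why a protected connection can be far more available than its working path alone, which is the kind of comparison an availability study of protection schemes makes.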

Chapter 5 describes the advances made within the framework of the ASON/GMPLS CARISMA network test-bed. The functional and physical design of an optical node is described, together with the new architecture of different modules at the three planes of the test-bed. Some parts of this chapter are based on [ICTON06, WGN07, JTI+D07].

In Chapter 6 the shared path protection with extra-traffic scheme is implemented in ASON rings provided with a GMPLS control plane. Firstly, a proposal to implement shared-path protection on rings is presented in detail. Then, an algorithm to compute disjoint routes under the wavelength continuity constraint is presented. Its performance is experimentally compared with a generally accepted RWA algorithm. This chapter is based on already published material ([JON09]).

Chapter 7 presents the ROADM design to support SPP with extra-traffic. Two strategies for fault localization are presented and experimentally compared. The protection time provided by this scheme is analyzed as a function of physical optical components. We demonstrate that the switching time of the components prevents the complete set of affected connections from being protected within 50 ms after fault detection. This limitation can be solved by traffic differentiation. This chapter is based on already published material ([ECOC08, ICTON08-2, JON09]).

In Chapter 8 two complete solutions to build ring-based dynamic optical networks with OMS protection capabilities are proposed and evaluated. They consist of: 1) a novel GMPLS-based mechanism which coordinates the protection actions after failures, and 2) a new node design to support OMS protection. Stringent performance requirements imposed careful optimization of the software and hardware architectures of the node. The experimental results reported in this chapter prove the efficiency of the node. This chapter is based on previously published material ([ICTON07, ECOC07, DRCN07, COMPNW08]).

In Chapter 9 a mechanism to provide protection times under 50 ms to the complete set of connections is presented. It chooses the layer (OMS or LSP) in which the protection is performed as a function of the number of LSPs to protect. This chapter is based on previously published material ([ICTON08-1, JON09]).

Finally, Chapter 10 summarizes the work and draws the main conclusions. It also lists the papers published in the course of this thesis, and presents future research lines to continue developing this topic.


Chapter 2

Intelligent Optical Transport Networks

An optical network is a network composed of optical nodes which are connected using optical fibers. In these networks the information is transmitted as an optical signal. In the network nodes, the signal can be treated optically or converted to the electrical domain.

This chapter introduces basic concepts and terminology that are relevant to the work presented in this thesis. We start by introducing the optical technology and the basic optical nodes and topologies. Then, we model the optical network as a layered network defining reference points and connections. This is followed by an overview of the ASON/GMPLS paradigm.

2.1 Optical Transmission Technology

The Wavelength Division Multiplexing (WDM) technology allows transmitting different data flows on different optical wavelengths. Most WDM systems currently use the wavelength region around 1550 nm, because this is one of the regions where the signal attenuation reaches a local minimum. Fig. 2-1 shows an example of the WDM technology.

WDM systems with channel spacings ranging from 12.5 GHz to 100 GHz have been specified [G.694.1]. With that technology, the number of optical wavelength channels multiplexed onto a single fiber ranges from 50 to 400. When such a large number of channels can be transported by the WDM system, the term Dense Wavelength Division Multiplexing (DWDM) is used, in contrast to Coarse Wavelength Division Multiplexing (CWDM), which is considered for the metropolitan network and multiplexes a limited number of wavelengths onto a single fiber.
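As a rough illustration of the channel counts quoted above, the following Python sketch divides an assumed usable band by the channel spacing and derives grid frequencies anchored at 193.1 THz, the reference frequency of the [G.694.1] grid. The 5 THz band width and all function names are assumptions made for this example only; they are not taken from the thesis.

```python
C_M_PER_S = 299_792_458   # speed of light in vacuum (m/s)
ANCHOR_THZ = 193.1        # reference frequency of the G.694.1 DWDM grid


def channel_count(band_ghz: float, spacing_ghz: float) -> int:
    """Number of channels of a given spacing that fit in a band."""
    return int(band_ghz // spacing_ghz)


def grid_frequency_thz(n: int, spacing_ghz: float) -> float:
    """Nominal central frequency of grid channel n (n may be negative)."""
    return ANCHOR_THZ + n * spacing_ghz / 1000.0


def wavelength_nm(freq_thz: float) -> float:
    """Vacuum wavelength corresponding to an optical frequency."""
    return C_M_PER_S / (freq_thz * 1e12) * 1e9


if __name__ == "__main__":
    band_ghz = 5000.0  # assumed usable band (hypothetical value)
    for spacing in (100.0, 50.0, 25.0, 12.5):
        print(spacing, "GHz spacing:", channel_count(band_ghz, spacing), "channels")
    print("anchor wavelength (nm):", round(wavelength_nm(ANCHOR_THZ), 2))
```

Under the assumed 5 THz band, the 100 GHz and 12.5 GHz spacings give 50 and 400 channels respectively, which is consistent with the range mentioned in the text.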

[Figure: data flows are mapped onto wavelengths (λ), combined by a de/multiplexer into a WDM signal, carried over an optical fiber, and separated again by a second de/multiplexer.]

Fig. 2-1 WDM Technology

Light emitters (usually semiconductor lasers) are key components in any optical network. They convert the electrical signal into a corresponding light signal, on a single wavelength, that can be injected into the fiber. Besides, a DWDM system uses a multiplexer at the transmitter to multiplex the different wavelengths together in a bundle, and a demultiplexer at the receiver to split them apart. An optical fiber transmits the optical signal over long distances. However, the power of the signal is reduced as it propagates; this effect is called attenuation. The receiver sensitivity indicates the minimum power required to detect the incoming signal. In order to compensate for the effect of attenuation, the optical signal can be amplified within the optical domain.
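The relationship between attenuation, amplification and receiver sensitivity can be sketched as a simple power budget in decibels. The snippet below is a minimal illustration; the launch power, attenuation coefficient, amplifier gain and sensitivity used in the example are hypothetical values, not figures from this thesis.

```python
def received_power_dbm(tx_power_dbm: float,
                       attenuation_db_per_km: float,
                       length_km: float,
                       amplifier_gain_db: float = 0.0) -> float:
    """Power at the receiver after fiber attenuation and optional amplification."""
    return tx_power_dbm - attenuation_db_per_km * length_km + amplifier_gain_db


def is_detectable(rx_power_dbm: float, sensitivity_dbm: float) -> bool:
    """The signal is detected only if it reaches the receiver sensitivity."""
    return rx_power_dbm >= sensitivity_dbm


if __name__ == "__main__":
    # Hypothetical link: 0 dBm launch power, 0.2 dB/km, 100 km, -18 dBm sensitivity.
    plain = received_power_dbm(0.0, 0.2, 100.0)
    amplified = received_power_dbm(0.0, 0.2, 100.0, amplifier_gain_db=10.0)
    print(plain, is_detectable(plain, -18.0))
    print(amplified, is_detectable(amplified, -18.0))
```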

When the optical signal travels through an optical fiber it is also distorted by the effect of dispersion, which modifies the optical pulse duration. This may lead to inter-symbol interference. Two important types of dispersion can be compensated: the Chromatic Dispersion (CD) and the Polarization Mode Dispersion (PMD). Regenerators allow compensating for dispersion, by converting the optical signal to the electrical domain.

Today, on a typical 40-channel DWDM system transporting 10 Gbit/s per wavelength, the maximum distance that can be covered without regeneration is about 2000 km [PePe04]. Moreover, new modulation formats, currently in a pre-commercial phase, will increase both bandwidth and distance [Wi08].

SONET/SDH technologies, and more recently Optical Transport Network (OTN) [G.709], standardize transmission frame formats, including a set of overhead bytes. Although paths are end-to-end, section overhead bytes are terminated and processed at every intermediate node. Adjacent nodes communicate with each other using those bytes. On the contrary, in all-optical or transparent networks optical connections are established between end nodes, assigning them a specific optical channel, without any intermediate electronic processing. In such transparent optical networks, a node cannot communicate with any adjacent node through the transmission frame.


2.2 Network topology

The first DWDM systems were point-to-point systems. The introduction of Optical Add/Drop Multiplexers (OADMs) in the transport networks allows them to be configured in ring-based topologies, similar to traditional synchronous digital hierarchy (SDH) [G.707] networks¹. An OADM allows dropping a specific wavelength out of the bundle of DWDM-multiplexed signals, and adding another channel on the same wavelength (Fig. 2-2a).

[Figure: a) an Optical Add/Drop Multiplexer (OADM) with East and West DWDM ports and access ports, used in ring networks; b) an Optical Cross-Connect (OXC), used in mesh networks.]

Fig. 2-2 Optical nodes and topologies

The introduction of sophisticated optical devices such as Wavelength Selective Switches (WSS) [TsHu06] made it possible to build evolved OADM architectures and optical cross-connects (OXCs), the key element to build optical mesh-based networks (Fig. 2-2b) [RoCo08].

The nodal degree (d) of a node is the number of links incident on the node (number of DWDM ports). In the rest of this thesis we use the term OADM for optical nodes with d = 2, and the term OXC for optical nodes with d > 2.
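The nodal degree can be computed directly from the list of links of a topology. The following Python sketch counts the links incident on each node and labels nodes of degree 2 as OADMs and nodes of higher degree as OXCs; the example topology (a four-node ring with one chord) and the function names are hypothetical.

```python
from collections import Counter


def nodal_degrees(links):
    """Return {node: degree} for an iterable of undirected (a, b) links."""
    degree = Counter()
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def node_type(degree: int) -> str:
    """Classify a node by nodal degree, following the terminology above."""
    if degree == 2:
        return "OADM"
    if degree > 2:
        return "OXC"
    return "unclassified"  # d < 2 is not covered by the terminology


if __name__ == "__main__":
    # Hypothetical topology: ring A-B-C-D plus a chord A-C.
    links = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
    for node, d in sorted(nodal_degrees(links).items()):
        print(node, d, node_type(d))
```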

Ring-based networks present lower capacity efficiency than mesh networks; mesh networking allows connections to be routed over shorter paths. In this regard, mesh-based networks have been extensively used in packet-based networks due to their high efficiency and flexibility. In practice, however, the cost per DWDM port in OXCs is much higher than in OADMs.

In general, a telecommunication network is typically split into two segments: the transport or core network and the access network. The transport network links the nodes in the important cities with each other. The access network is the part of the network allowing the individual customers to connect to the nearest network node. Between the transport and the access network, we can also find a metro network, expanding over a big city or an entire region.

¹ The Synchronous Optical NETwork (SONET) technology, standardized by the American National Standards Institute (ANSI), is the U.S. counterpart of the SDH technology.

This thesis focuses on optical transport networks, where mesh and ring topologies can be found together. Nonetheless, in the near future ring-based networks will remain more extensively deployed than mesh networks.

2.3 Network model

The ITU-T defines in [G.805] a reference layered transport network architecture with technology-independent relationships amongst functional entities. Therein, each network layer holds a twofold role, namely a server role to the client layer above it, as well as a client role to the network layer below it.

In brief, a sub-network describes the capacity to associate a set of connection points (CP) to convey the so-called "characteristic information". With this objective, two possible kinds of connection are defined. A link connection is a fixed and inflexible connection between two CPs. Conversely, a Sub-Network Connection (SNC) is a flexible connection that may be set up and released by either the control or the management plane. As a result, a network connection is a concatenation of sub-network and link connections delimited by a Termination Connection Point (TCP) pair. Correspondences between ITU-T and IETF terminology can be found in [RFC-4394, RFC-4397].

Through the defined reference layered network architecture, the optical network can be modeled as a two-layered transport network. The first layer is represented by the DWDM TE links and optical network ports, whereas the second layer is represented by the different wavelength channel data links and access optical ports.

A two-layered 3-node all-optical network is shown in Fig. 2-3a. In such a scenario, link connections associate CPs at remote neighboring nodes. Those link connection sets are bundled into network connections between remote TCPs, which respectively represent TE links and optical network ports.

Let us now suppose that a path is set up between ingress node A and egress node C (Fig. 2-3b). The incoming client signal at the optical node A is adapted and cross-connected by an SNC to an outgoing CP. This CP is, in turn, connected through a data link to an incoming CP in the neighbor. At the intermediate node B, an SNC binds incoming and outgoing CPs, which must be mapped to the same wavelength if no wavelength converter is used (wavelength continuity). As soon as the signal reaches the destination node C, it is cross-connected, adapted and sent to the optical access port.
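The wavelength continuity constraint mentioned above can be sketched as a set intersection: a wavelength is usable on a route only if it is free on every data link of that route. The Python snippet below is an illustration under stated assumptions; the per-link free-wavelength sets are hypothetical, and the first-fit choice is a common textbook heuristic used here for illustration, not necessarily the assignment policy used in this thesis.

```python
def common_free_wavelengths(route, free):
    """Wavelengths that are free on every link of the route.

    route: list of nodes, e.g. ["A", "B", "C"]
    free:  dict mapping a directed link (u, v) to its set of free wavelength indices
    """
    links = list(zip(route, route[1:]))
    if not links:
        return set()
    common = set(free[links[0]])
    for link in links[1:]:
        common &= free[link]
    return common


def first_fit(route, free):
    """Lowest-indexed wavelength satisfying continuity, or None if blocked."""
    common = common_free_wavelengths(route, free)
    return min(common) if common else None


if __name__ == "__main__":
    # Hypothetical state of the two links of the A-B-C path.
    free = {("A", "B"): {0, 2, 3}, ("B", "C"): {1, 2, 3}}
    print(common_free_wavelengths(["A", "B", "C"], free))
    print(first_fit(["A", "B", "C"], free))
```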


This network model will be used in following chapters to build specialized protection mechanisms at the path and at the link layer.

[Figure: three nodes (A, B, C) drawn with a link layer (TE links bundling data links) and a path layer; panel a) shows the layered network, panel b) shows an LSP set up from node A to node C. Legend: Connection Point (CP), Termination Connection Point (TCP), Subnetwork Connection (SNC).]

Fig. 2-3 Example of a layered optical network architecture

2.4 The ASON/GMPLS paradigm

Legacy transport networks, based on SONET/SDH technologies, were designed to be managed by a centralized system. The Network Management System (NMS) provided all the management functions and needed to process a huge amount of information. Operators established connections manually in the transport network using the NMS. Those connections remained stable in the network for a long period of time (months or even years). Nevertheless, that scenario is rapidly changing; the traffic to be carried by today’s transport networks increases very rapidly, mainly due to the massive use of the Internet and multimedia applications.

The advent of low-cost, flexible reconfigurable OADMs (ROADMs), in which the dropping and adding of wavelength channels can be remotely controlled, was very important. The main advantages of ROADMs are:

The planning of the entire bandwidth assignment need not be carried out during initial rollout. The configuration can be done as and when required.

ROADM allows for remote operation and reconfiguration.

ROADM allows for automatic power balancing.


The introduction of intelligence in Automatically Switched Optical Networks (ASON) [G.8080] using a Generalized Multiprotocol Label Switching (GMPLS) control plane [RFC-3945] allows optical connections to be set up, configured, and released in a fast and dynamic way. Automating the network operations significantly reduces manual intervention and the associated costs of connection handling (OPEX). Network data, configuration commands, and acknowledgements are automatically created and exchanged by signaling and routing protocols. Client network layers (IP, SDH, etc.) can request optical connections through the optical network via the standardized User Network Interface (UNI) [RFC-4208].

The following three kinds of connections, differing in connection establishment type, can be distinguished in ASON [G.8080]:

permanent,

switched,

soft-permanent.

The permanent connection is set up either by a management system or by manual intervention and is also referred to as a provisioned connection. Therefore, such a connection does not require any intervention of the control plane and does not involve automatic routing or signaling. Usually, this is a static connection lasting for a relatively long time, such as months or years.

The switched connection is established on demand by the communicating end-points by using routing and signaling capabilities of the control plane. The switched connection requires a UNI and its set up may be the responsibility of the end user (the client network).

The soft-permanent connection is activated by the NMS triggering a command to the control plane. The relevant connection establishment is referred to as a hybrid connection set up. In this case no UNI is needed.

2.4.1 Automatically Switched Optical Network

An Automatically Switched Optical Network (ASON) [G.8080] is an optical transport network that has dynamic connection capability. This functionality is accomplished by using a control plane that performs routing, signaling and resource discovery.

The ASON architecture (Fig. 2-4) defines three different planes which exchange information through a set of defined interfaces:

The transport plane represents the functional resources of the optical network which convey user information between locations. It includes optical nodes and optical fibers, and is able to measure parameters to characterize the state of the connections, detect failures, etc.


The control plane is in charge of the resource management, routing and connection signaling. The objective is to define an intelligent control plane able to create, modify and release connections automatically.

The management plane provides the network management functions (FCAPS): Fault management, Configuration management, Accounting, Performance management, and Security [M.3400].

[Figure: the management plane (NMS), the control plane (OCCs interconnected through the DCN) and the transport plane (optical nodes carrying client data), with the interfaces between them: UNI towards clients (user signaling), I-NNI and E-NNI between OCCs, CCI between each OCC and its optical node, and NMI-A and NMI-T towards the NMS.]

OCC: Optical Connection Controller. CCI: Connection Control Interface. E-NNI: External Network-Network Interface. I-NNI: Internal Network-Network Interface. NMS: Network Management System. UNI: User-Network Interface. DCN: Data Communication Network. NMI-A: Network Management Interface, Control Plane. NMI-T: Network Management Interface, Transport Plane.

Fig. 2-4 The ASON architecture

2.4.2 Generalized Multiprotocol Label Switching

Within the IETF, the Common Control and Measurement Plane Working Group (CCAMP) [CCAMP] is leading the standardization of a framework known as Generalized Multiprotocol Label Switching (GMPLS) [RFC-3945]. GMPLS is a technology that provides enhancements to Multiprotocol Label Switching (MPLS) to support network switching not only at the packet level but also at time slot, wavelength, or even fiber level.

The terminology used by the IETF for the ITU term network connection is Label Switched Path (LSP). In the rest of this thesis we use the term LSP for the sake of brevity.

GMPLS provides a control plane in support of optical networking. GMPLS is based on Traffic Engineering (TE) extensions of:

the RSVP-TE signaling protocol [RFC-3209, RFC-3473], and

the intra-domain link-state OSPF-TE routing protocol [RFC-3630].

Moreover, the use of technologies like DWDM implies that we can now have a very large number of parallel links between two adjacent nodes (hundreds of wavelengths, or even thousands of wavelengths if multiple fibers are used). To solve this issue the concept of link bundling was introduced. In addition, the manual configuration and control of these links, even if they are unnumbered (links that do not have IP addresses), becomes impractical. The Link Management Protocol (LMP) [RFC-4204] was specified to solve these issues.

[Figure: the OCC with its UNI, I-NNI and CCI interfaces and its four functional blocks: Routing Controller, Link Resource Manager, Connection Controller and Call Controller.]

Fig. 2-5 OCC functional blocks

The key element in the control plane is the Optical Connection Controller (OCC). Fig. 2-5 shows the functional blocks of an OCC. The OCC architecture includes the components described in [G.8080]. Specifically, it contains:

the Call Controller, which is responsible for admission policies. Connection requests from clients arrive through the UNI interface [RFC-4208],

the Connection Controller component, which is responsible for connections set-up, modification and tear down. It exchanges RSVP-TE messages,

the Routing Controller component, which is responsible for routes computation. It exchanges OSPF-TE messages, and

the Link Resource Manager, which is responsible for local data-link and TE link resources. This component uses the Connection Control Interface (CCI) to manage the local OXC and exchanges LMP messages.

In this thesis we assume transparent networks provided with a GMPLS control plane.

2.4.3 Optical Supervisory Channel

The Optical Supervisory Channel (OSC) is an additional optical signal used to carry information about the DWDM optical signal as well as remote conditions at the optical end nodes (Fig. 2-6). This is analogous to the SONET/SDH DCC (or supervisory channel) [G.707]. Unlike the normal traffic wavelengths, the OSC is always terminated at intermediate amplifier sites, where it receives local information before retransmission.


[Figure: an optical node with its OCC, and a plot of fiber attenuation (dB/km, ticks 1 to 3) versus wavelength (1.2 to 1.7 µm) with the OSC marked at 1310 nm.]

Fig. 2-6 Optical Supervisory Channel (adapted from [Gr04])

The control plane can be configured out-of-fiber or in-fiber out-of-band. In the former, the information at the I-NNI interface is transmitted using resources external to the controlled transport plane, whereas in the latter it is transmitted over the OSC.

2.5 Summary

In this chapter the ASON/GMPLS paradigm has been presented. An ASON is an optical transport network which has dynamic connection set up/tear down capability. This functionality is accomplished by means of a control plane which carries out, among others, routing and signaling functions. GMPLS provides a suitable control plane for dynamic optical networks.

The optical transport network has been modeled as a layered network with two layers: the link layer and the path layer. Based on this network model, recovery mechanisms differentiated and adapted to the link and the path layer are presented in the next chapter.


Chapter 3

Recovery mechanisms at the optical layer

Failures at the optical layer are very important due to the high bandwidth available per wavelength and the number of wavelengths per fiber; a single failure usually implies important traffic losses. Fiber cuts resulting from, for instance, digging works or the failure of an individual transmitter or receiver are quite common.

In this chapter we define the term availability and present the way to compute it in the steady state. Besides, typical values for failure and repair times for both the optical transmitter and the optical cable are presented. Then, we introduce the most commonly used terms in resilience. Finally, differentiated schemes at the OMS and at the LSP optical layers are presented. Some of those recovery mechanisms are used in the next chapters of this thesis.

3.1 Availability and Service Level Agreements

This thesis focuses on survivable optical networks, and different recovery mechanisms are going to be compared. In that context, an important issue is the availability. Generally speaking, availability is the probability that a system x will be found in the operating state at a random time in the future. In particular, we study the availability of LSPs over optical networks when different recovery strategies are applied.

Steady state availability can be expressed as [Gr04]:


$$A = \frac{\mathit{UpTime}}{\mathit{UpTime} + \mathit{DownTime}} = \frac{\mathit{MTTF}}{\mathit{MTTF} + \mathit{MTTR}} \qquad (3.1)$$

where:

MTTR: Mean time to repair, the expected time needed to repair the network component.

MTTF: Mean time to failure, the expected time to the next failure of the network component, following completion of the repair. MTTF is usually expressed in hours, or equivalently as a failure rate in FITs (number of failures in 10^9 hours).

Related to this is unavailability U, the probabilistic complement of availability A:

$$U = 1 - A \qquad (3.2)$$

We consider that the system x is comprised of a number of components (or subsystems) subject to failure. In optical networks, links may be supported by common cables or conduits, and thus, links in the network may fail dependently [Gu07]. However, when deploying long-haul core networks connecting main cities, the planning phase must address this issue by choosing fibers supported by disjoint infrastructure. Thus, in this thesis we consider the links in the network as being mutually failure-independent, and we can therefore apply the following equations.

If the system is comprised of components in a series availability relationship then all components must be operating for the system to be available. For elements in series the overall availability function becomes:

$$A_s = \prod_i A_i \qquad (3.3)$$

If the MTTR is much smaller than the MTTF ([Gr04]):

$$A_s = \prod_i A_i \approx 1 - \sum_i U_i \qquad (3.4)$$

For elements in parallel, the exact form for the system unavailability is:

$$U_s = \prod_i U_i \qquad (3.5)$$
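The series and parallel relationships of equations (3.3) to (3.5) translate directly into code. The following Python sketch is a minimal illustration for mutually failure-independent components; the function names and the example availabilities are chosen for this example only.

```python
from math import prod


def series_availability(availabilities):
    """Eq. (3.3): all components must be up, so availabilities multiply."""
    return prod(availabilities)


def series_availability_approx(availabilities):
    """Eq. (3.4): first-order approximation, valid when MTTR << MTTF."""
    return 1.0 - sum(1.0 - a for a in availabilities)


def parallel_unavailability(availabilities):
    """Eq. (3.5): the system is down only if every component is down."""
    return prod(1.0 - a for a in availabilities)


if __name__ == "__main__":
    components = [0.999, 0.999]  # hypothetical component availabilities
    print(series_availability(components))
    print(series_availability_approx(components))
    print(parallel_unavailability(components))
```

For highly available components the exact and approximate series values differ only in the product of the individual unavailabilities, which is why the approximation is normally sufficient.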

In this thesis we assume the figures presented in Table 3-1 for MTTF and MTTR [ToNe94, SVCo05].

Table 3-1 MTTF and MTTR typical values

Parameter | Value
Tx failure rate | 10,867 FITs
Rx failure rate | 4,311 FITs
Plug-replacement equipment MTTR | 2 hours
Fiber-optic cable MTTR | 12 hours
Fiber-optic cable failure rate | 311 FITs/km


Since distances in optical transport networks can be counted in hundreds of km, the system components with the highest failure rate are optical cables. Therefore, in this thesis we focus our study on the analysis of cable cuts. The results can be easily adapted to equipment failure analysis.

To exemplify the concepts shown so far, let us define the availability of an unprotected LSP (UP). Firstly, let us define L as the set of links transporting the particular LSP. That LSP will be available only if all links transporting the LSP are available, and thus using equation (3.4), its availability can be accurately estimated as:

$$A_{UP} \approx 1 - \sum_{i \in L} U_i \qquad (3.6)$$

where Ui is the physical unavailability of the ith link transporting the LSP.
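As a worked illustration of equation (3.6), the Python sketch below derives the unavailability of each link from the cable figures of Table 3-1 and then estimates the availability of an unprotected LSP. It assumes that the cable failure rate scales linearly with length and that only cable cuts are considered; the link lengths in the example are hypothetical.

```python
CABLE_FITS_PER_KM = 311.0  # fiber-optic cable failure rate (Table 3-1)
CABLE_MTTR_H = 12.0        # fiber-optic cable MTTR in hours (Table 3-1)


def cable_unavailability(length_km: float) -> float:
    """U = MTTR / (MTTF + MTTR) for a cable of the given length.

    With failure rate lam = 1 / MTTF (per hour), this equals
    MTTR * lam / (1 + MTTR * lam).
    """
    lam = CABLE_FITS_PER_KM * length_km / 1e9  # failures per hour
    return CABLE_MTTR_H * lam / (1.0 + CABLE_MTTR_H * lam)


def unprotected_lsp_availability(link_lengths_km) -> float:
    """Eq. (3.6): one minus the sum of the link unavailabilities."""
    return 1.0 - sum(cable_unavailability(length) for length in link_lengths_km)


if __name__ == "__main__":
    lengths = [100.0, 200.0, 150.0]  # hypothetical link lengths in km
    print(unprotected_lsp_availability(lengths))
```

With these assumed lengths the estimated availability is about 0.9983, that is, roughly 14.7 hours of expected downtime per year.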

Several client layer networks may be served by one optical transport network. Typically, SONET/SDH and IP/MPLS are the most widespread client technologies. To allow an automated service delivery that is executed on a pure machine level, correct agreements and regulations in the form of detailed Service Level Agreements (SLAs) have to be negotiated between the network operators and customers. SLAs can be stricter depending on the requested class of service; nevertheless, SLA breaches turn into revenue losses for carriers. SLAs may include:

Maximum LSP establishing time.

Maximum time to repair/recover from failures.

Maximum number of failures/month.

Service availability.

Error rate.

3.2 Recovery Schemes

The way to improve availability in optical transport networks is by means of recovery schemes. In general, resiliency is the capability of the network to continue in operation even when failures occur. Resiliency is provided by either protection or restoration mechanisms. The former is based on the replacement of a failed resource (e.g., a link or an LSP) with a pre-assigned backup resource; the latter is based on rerouting the data flow using spare capacity. In both mechanisms, protection (or spare) resources can be either dedicated, in which case the spare resource is dedicated to a single working LSP, or shared, in which case the same spare resource may be used to provide protection to multiple working LSPs. The shared protection scheme is typically more complex to implement and manage but consumes fewer resources than the dedicated approach.


Two different dedicated protection schemes have been proposed, known as 1+1 and 1:1 (more generally 1:N) protection. In 1+1 protection the source node transmits on both the working and the protection LSPs simultaneously. The destination keeps monitoring both LSPs, dynamically choosing the best performing signal. Therefore, if degradation of the signal is detected on the working LSP, the destination immediately switches to the protection LSP. In 1:1 protection the backup LSP is used to transmit low-priority, best effort traffic. Upon failure of the working LSP, both the source and the destination switch over to the backup protection LSP, preempting the low-priority traffic. Note that since we assume that best effort client traffic (e.g. IP traffic) is transported over the optical network by means of preemptable extra-traffic resources, the term best effort traffic is equivalent to the term extra-traffic.
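The difference between the two dedicated schemes can be sketched with two small pieces of Python. This is a deliberately simplified model: signal quality is reduced to an up/down flag, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


def select_1plus1(working_ok: bool, protection_ok: bool) -> str:
    """1+1: both copies are always transmitted; the destination picks one."""
    if working_ok:
        return "working"
    if protection_ok:
        return "protection"
    return "none"


@dataclass
class OneToOneState:
    """1:1: the backup LSP carries extra-traffic until it is needed."""
    active: str = "working"
    extra_traffic: bool = True

    def on_working_failure(self) -> None:
        # Source and destination switch over to the backup LSP,
        # preempting the extra-traffic it was carrying.
        self.active = "backup"
        self.extra_traffic = False


if __name__ == "__main__":
    print(select_1plus1(working_ok=False, protection_ok=True))
    state = OneToOneState()
    state.on_working_failure()
    print(state)
```

In the 1+1 case only the destination takes a decision, whereas in the 1:1 case both end nodes must act, which is why the latter needs coordination between them.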

Protection mechanisms at the optical layer can work both at the optical link layer (Optical Multiplex Section, OMS), and at the optical path (LSP) layer, also known as optical channel (OCh). Protecting at the OMS layer allows recovering the complete bundle of multiplexed optical channels in a fiber with only one protection action. When a link failure occurs, the two optical nodes adjacent to the failure loop back the bundle of working channels on the protection channels in the opposite direction. On the contrary, when protection is applied at the path layer, only the affected LSPs are protected.

Fig. 3-1 presents a general classification of protection and restoration schemes. Schemes can be classified either by resource sharing or in terms of the layer where the rerouting is performed.

[Figure: classification tree of fault recovery schemes. Protection is classified by resource sharing (dedicated, shared) and by rerouting (OMS, LSP); restoration is classified by rerouting (OMS, LSP).]

Fig. 3-1 Protection and restoration schemes (adapted from [ZhMu04])

In particular in a ring-based network, the most natural way to provide recovery is using a protection scheme. Here, another distinction can be made based on the direction in which the traffic is transmitted under normal working conditions. In a unidirectional ring, signals are always transmitted in the same direction on the ring, whereas in a bidirectional ring, signals are transmitted in both directions of the ring.


Table 3-2 presents a schematic view of the different protection schemes. A more detailed review of the existing resilience schemes can be found in [VaPiDe04, Gr04].

Table 3-2 Protection Schemes

Protection scheme | Characteristics
OMS DPRing (OULSR) | Dedicated protection, local recovery scheme performed by the OADMs adjacent to the failure.
OMS SPRing (OBLSR) | Shared protection, local recovery scheme performed by the OADMs adjacent to the failure.
OCh DPRing (OUPSR) | Dedicated protection, end-to-end recovery scheme performed by the OADMs on which the traffic enters/leaves the ring.
OCh SPRing (OBPSR) | Shared protection, end-to-end recovery scheme performed by the OADMs on which the traffic enters/leaves the ring.
DPP / SPP | Similar to OCh DPRing/OCh SPRing, but applied also to mesh networks.

Legacy SONET/SDH transport networks are well known for their inherently fast protection switching capability, which allows service recovery within 50 ms after fault detection [G.841]. An interruption of 50 ms or less in a transmission signal is perceived by higher layers as a transmission error. At the IP layer it may cause a packet retransmission handled by TCP/IP, but no TCP/IP sessions will be affected at all. In VoIP applications, users do not perceive 100 ms outages [IaCh02]. A complete discussion about the 50 ms figure can be found in [Gr04]. As stated in Chapter 1, one of the objectives of this thesis is to bring to the optical layer protection schemes providing the shortest recovery times. Therefore, the 50 ms figure is the reference time for the schemes proposed in this thesis.

3.3 OMS Protection Mechanisms

3.3.1 OMS Dedicated Protection Ring (OMS DPRing)

OMS DPRing consists of two counter-rotating unidirectional rings, transmitting in opposite directions (Fig. 3-2a). Only one fiber is dedicated to working traffic while the other is reserved for protection. Both flows of a bidirectional LSP are routed on different sides of the ring, using the same wavelength. There is thus no possibility to reuse wavelengths on the ring for different LSPs. Therefore, the maximum capacity that can be allocated on the ring is limited to the capacity of a single link.

When a link failure occurs, it is detected by the two optical nodes adjacent to the failure. Both nodes loop back the bundle of optical channels on the protection ring in the opposite direction (dashed lines in Fig. 3-2b). To perform and manage the efficient switching to the protection fiber, an Automatic Protection Switching (APS)-like protocol [G.841] is required.
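The loop-back action described above can be sketched as a route computation on the ring. In the following Python illustration the node upstream of the failure sends the traffic around the ring in the opposite direction until it reaches the node downstream of the failure, where it rejoins the working route. The six-node ring follows the node labels of Fig. 3-2, but the LSP end points (A to D) are an assumption made for this example, and the APS-like signaling itself is not modeled.

```python
def loopback_route(ring, working_path, failed_link):
    """Node sequence followed by an LSP after OMS loop-back.

    ring:         nodes in ring order, e.g. ["A", "B", "C", "D", "E", "F"]
    working_path: nodes visited by the LSP, following adjacent ring nodes
    failed_link:  pair of adjacent ring nodes (either orientation)
    """
    hops = list(zip(working_path, working_path[1:]))
    u, v = failed_link
    if (v, u) in hops:              # orient the link along the working path
        u, v = v, u
    if (u, v) not in hops:
        return list(working_path)   # this LSP does not cross the failed link
    n = len(ring)
    pos = ring.index(u)
    # Direction (+1 or -1 around the ring) in which the working path leaves u.
    step = 1 if ring[(pos + 1) % n] == v else -1
    detour = []
    while ring[pos] != v:           # travel the opposite way until reaching v
        pos = (pos - step) % n
        detour.append(ring[pos])
    i = working_path.index(u)
    return working_path[: i + 1] + detour + working_path[i + 2:]


if __name__ == "__main__":
    ring = ["A", "B", "C", "D", "E", "F"]
    print(loopback_route(ring, ["A", "B", "C", "D"], ("B", "C")))
```

The restored route traverses part of the ring twice, which illustrates why loop-back recovery is fast (only two nodes act) but lengthens the path.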

[Figure: six-node ring (A to F) with working and protection fibers, shown a) before and b) after the failure.]

Fig. 3-2 An OMS DPRing transporting one LSP, a) before and b) after a failure in the link B-C (adapted from [Ar00])

3.3.2 OMS Shared Protection Ring (OMS SPRing)

In OMS SPRing, the total capacity of each fiber is divided in two wavebands (B1, B2), as shown in Fig. 3-3a. One waveband on each fiber –B1 clockwise and B2 counter-clockwise– is reserved to transport working channels while the other is used to transport protection channels. Thus, in the OMS SPRing scheme, working and protection channels share each fiber. Working connections in one fiber are protected by the available capacity in the other fiber, in the opposite direction of the ring. This way no wavelength converters are needed when moving channels from working to protection bands.

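[Figure: six-node ring (A to F) in which each fiber is split into wavebands B1 and B2 for working and protection channels, shown a) before and b) after the failure.]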

Fig. 3-3 An OMS SPRing transporting two LSPs, a) before and b) after a failure in the link B-C (adapted from [Ar00])

Both directions of a bidirectional LSP are routed along the same side of the ring, in different fibers. Thus, the same wavelength can be reused to accommodate a connection between other nodes, whose route does not overlap the existing connection (connections A-D and E-F in Fig. 3-3a).

When a link (or node) failure is detected at the OMS level, the nodes adjacent to the failure will loop back all LSPs at once on the protection channels of the ring (Fig. 3-3b). Similarly to the OMS DPRing scheme, an APS-like protocol is required to manage the switching actions and ensure the correct use of the shared protection capacity.

Although the implementation of OMS SPRing is more complex than that of OMS DPRing, it provides better bandwidth efficiency [VaPiDe04]. For example, the maximum number of protected LSPs that can be transported in an n-node ring with OMS DPRing is limited to the number of wavelengths available in each link (hereafter W). On the contrary, using OMS SPRing the maximum number of protected LSPs which can be transported depends on the traffic pattern. Fig. 3-4 compares the maximum number of LSPs that can be transported using the OMS SPRing under three different traffic patterns, assuming W=40. It ranges from W for hub-like traffic (one node sources all traffic) to a maximum of Wn/2, for the case where the nodes send traffic only to their adjacent nodes.

[Figure 3-4: four-node rings (nodes 1 to 4) annotated with the number of paths per link for three panels: a) Hub-like Traffic Pattern, b) Full-Mesh Traffic Pattern, c) Adjacent Traffic Pattern]

Fig. 3-4 LSPs transported with the OMS SPRing scheme

As shown, the OMS SPRing scheme can transport at least as many protected LSPs as the OMS DPRing scheme, and twice as many under the adjacent traffic pattern.
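The capacity bounds quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, it covers just the two extreme traffic patterns for which the text gives closed forms (W and Wn/2), and n=4 is an assumption taken from the four nodes drawn in Fig. 3-4.

```python
def max_lsps_dpring(wavelengths: int) -> int:
    """OMS DPRing: bounded by the number of wavelengths per link (W)."""
    return wavelengths


def max_lsps_spring(wavelengths: int, nodes: int, pattern: str) -> int:
    """OMS SPRing bounds quoted in the text for the two extreme patterns."""
    if pattern == "hub":
        return wavelengths
    if pattern == "adjacent":
        return wavelengths * nodes // 2
    raise ValueError(f"no closed form is given in the text for {pattern!r}")


W, N = 40, 4  # W from the text; N=4 is an assumption based on Fig. 3-4
print("DPRing:", max_lsps_dpring(W))
print("SPRing, hub-like:", max_lsps_spring(W, N, "hub"))
print("SPRing, adjacent:", max_lsps_spring(W, N, "adjacent"))
```

With these values the adjacent pattern carries twice as many protected LSPs as OMS DPRing, while the hub-like pattern carries the same number.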

3.3.3 OMS protection in mesh networks

Mesh-based networks are extensively used in packet-based networks due to their high efficiency and flexibility. Nevertheless, circuit-oriented transport networks have traditionally been designed as ring-based networks because of their inherently fast protection switching capabilities and high circuit availability. However, with the introduction of the p-cycles concept [Gr98], fast protection is also possible in mesh networks. The p-cycles concept can be applied to a wide range of technologies (such as WDM, SONET/SDH, or IP/MPLS networks) and protection schemes (such as path and link protection) [Gr04].

A p-cycles network is a mesh network with pre-connected closed cycles (p-cycles) defined on it; in other words, rings are configured over the mesh network. Before the network is deployed, the planning phase determines which of the candidate cycles optimize the network availability.

One p-cycle includes a subset of the nodes in the network, and therefore several p-cycles can be defined. Links that form part of the p-cycle are called on-cycle links, while links that connect two nodes of the p-cycle without being part of it are called straddling links. Fig. 3-5a shows an example of a p-cycle network.

On-cycle links have differentiated working and protecting capacity, while straddling links have twice the working capacity and no protecting capacity.

This kind of network presents better efficiency, in terms of protecting/working ratio, than ring networks. In ring networks, one protecting capacity unit protects one working capacity unit, so this ratio is always 100%. In a p-cycles network this ratio is lower because the on-cycle protecting capacity protects not only the on-cycle links but also the straddling links. For example, the network depicted in Fig. 3-5a has a protecting/working ratio of 8/18 = 44.4%.
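The ratio quoted above follows directly from the capacity rules for on-cycle and straddling links. The sketch below is a minimal illustration with a hypothetical function name; the link counts used for Fig. 3-5a (8 on-cycle links and 5 straddling links) are an assumption chosen because they reproduce 8/18, since the figure itself is not available here.

```python
def protecting_working_ratio(on_cycle_links: int, straddling_links: int) -> float:
    """Protecting/working capacity ratio, in capacity units per link.

    Each on-cycle link contributes 1 working and 1 protecting unit;
    each straddling link contributes 2 working units and no protecting unit.
    """
    protecting = on_cycle_links
    working = on_cycle_links + 2 * straddling_links
    return protecting / working


# Assumed link counts for Fig. 3-5a (consistent with the quoted 8/18).
print(f"p-cycle network: {protecting_working_ratio(8, 5):.1%}")
# A plain ring has no straddling links, so the ratio is 100%.
print(f"ring: {protecting_working_ratio(6, 0):.1%}")
```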

[Figure 3-5: three panels a), b) and c); legend: on-cycle links, straddling links]

Fig. 3-5 a) A p-cycle network, b) the same network after a failure in an on-cycle link, and c) in a straddling link.

When an on-cycle link fails (Fig. 3-5b), its working traffic is protected by concatenating the protecting capacity of the remaining on-cycle links. On the other hand, when a straddling link fails (Fig. 3-5c), two different concatenated routes can be used to protect the link. This is the reason why straddling links can transport twice as much working traffic as on-cycle links.

An OMS p-cycles network is defined as a p-cycles network providing link protection. The dedicated OMS p-cycles scheme (hereafter OMS Dp-cycles) consists of links on one defined cycle and a number of straddling links (Fig. 3-6a). In the cycle, the whole capacity of one fiber is dedicated to working traffic, while the whole capacity of the second fiber is reserved for protection. In the straddling links, the whole capacity of both fibers is dedicated to working traffic.

[Figure 3-6: six-node networks (nodes A to F) in two panels: a) OMS Dp-cycles, b) OMS Sp-cycles]

Fig. 3-6 Examples of OMS Dp-cycles and OMS Sp-cycles networks.

Both directions of a bidirectional LSP are routed through the shortest path on the different sides of the cycle or through straddling links. Note that in this scheme all links are unidirectional. Although straddling links have two working fibers, each fiber transports different LSPs. An example of this is shown in Fig. 3-6a, where each direction of a bidirectional path is routed through different straddling links.

When a failure occurs, it is detected by the two optical nodes adjacent to the failure. Two cases can arise. If the failure is in an on-cycle link, both nodes loop back the working fiber, containing the affected multiplexed bundle of optical channels, onto the protecting fiber in the on-cycle links. If the failure is in a straddling link, one working fiber is protected by the protecting fiber on one side of the cycle, while the second working fiber in the straddling link is protected by the protecting fiber on the other side of the cycle.

In the shared OMS p-cycles scheme (hereafter OMS Sp-cycles), working and protection capacities in the on-cycle links share each fiber. The total capacity of each fiber is divided into two wavebands. The whole capacity of straddling links is for working traffic (Fig. 3-6b). Working connections in on-cycle and straddling links are protected by the available protecting capacity in the on-cycle fibers.

In this scheme all links are bidirectional, so both directions of a bidirectional LSP are routed through the same shortest route as shown in Fig. 3-6b. In this case links transport both directions of bidirectional LSPs.

When a failure occurs, it is detected by the two optical nodes adjacent to the failure. Both nodes loop back the affected multiplexed bundle of optical channels on the protection cycle in a similar way to the dedicated scheme, switching wavebands instead of fibers.

3.4 Path Protection Mechanisms

In the Dedicated-path Protection (DPP) scheme, a protection LSP is established over a dedicated disjoint alternate route to protect the working LSP. Traffic is simultaneously sent on both LSPs, and a selector is used at both ingress/egress nodes to receive the traffic.

If the graph representing the network is biconnected, the routes of the working and the protection LSPs can be node-disjoint between origin and destination nodes (end-to-end protection), whereas if the graph is only two-edge-connected, only link-disjoint routes can be found [Gr04].

However, when the network is not fully deployed in an area, when the utilization of some links in the network is high, or when there are failed links, it may not be possible to find a disjoint path pair. In such cases the best protection option is to protect the LSP wherever possible (segment protection) [RFC-4873]. Fig. 3-7 illustrates both concepts. Note that in segment protection other nodes in addition to the end nodes perform protection switching (nodes B2 and B3 in Fig. 3-7b).
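Whether a link-disjoint path pair exists between two nodes can be checked before falling back to segment protection. The sketch below is one possible way to do this, not the method used in this thesis: it counts link-disjoint routes with a standard unit-capacity augmenting-path (max-flow) search. The function name and the example topologies are hypothetical.

```python
from collections import deque


def count_link_disjoint_paths(links, src, dst, limit=2):
    """Count up to `limit` link-disjoint routes between src and dst.

    Each undirected link is modeled as two unit-capacity arcs, and
    augmenting paths are found with breadth-first search.
    """
    capacity, neighbors = {}, {}
    for a, b in links:
        for u, v in ((a, b), (b, a)):
            capacity[(u, v)] = capacity.get((u, v), 0) + 1
            neighbors.setdefault(u, set()).add(v)

    found = 0
    while found < limit:
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            break
        # Push one unit of flow back along the path that was found.
        v = dst
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        found += 1
    return found


# Hypothetical topologies: a six-node ring, and the same ring with a spur node G.
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "A")]
print(count_link_disjoint_paths(ring, "A", "D"))
print(count_link_disjoint_paths(ring + [("G", "A")], "G", "D"))
```

A result of 2 means end-to-end DPP is feasible; a result of 1 means that at least one link is unavoidable, so only the remaining segments can be protected.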

[Figure 3-7: two panels, (a) End-to-end protection and (b) Segment protection; node labels A1 to A6 and B1 to B4]

Fig. 3-7 End-to-end vs. segment protection


The Shared-path Protection (SPP) scheme is based on sharing protection resources among link-disjoint working LSPs. As an example, Fig. 3-8 shows two shared-protected LSPs where the working LSPs, w1 and w2, are strictly link-disjoint and the protection LSPs, p1 and p2, share the link 3-4.

[Figure 3-8: six-node network (nodes 1 to 6) with working LSPs w1 and w2 and protection LSPs p1 and p2]

Fig. 3-8 Shared-path protection
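The saving obtained by sharing can be illustrated by counting how many protection channels each link has to reserve. The sketch below is a simplified illustration under a single-link-failure assumption. The function names are hypothetical, and the routes are an assumed reading of Fig. 3-8 (only the shared link 3-4 is confirmed by the text).

```python
def link(u, v):
    """Undirected link identifier."""
    return frozenset((u, v))


def protection_channels(lsps, shared):
    """Protection channels to reserve on each link.

    lsps: list of (working_links, protection_links) pairs, each a non-empty set.
    With sharing, a link only needs enough channels for the worst single-link
    failure; without sharing, it needs one channel per protection LSP.
    """
    needed = {}
    for _, protection in lsps:
        for lk in protection:
            users = [w for w, p in lsps if lk in p]
            if shared:
                failures = set().union(*users)
                needed[lk] = max(sum(f in w for w in users) for f in failures)
            else:
                needed[lk] = len(users)
    return needed


# Assumed routes: w1 = 1-2 protected via 1-3-4-2, w2 = 5-6 protected via 5-3-4-6.
w1, p1 = {link(1, 2)}, {link(1, 3), link(3, 4), link(4, 2)}
w2, p2 = {link(5, 6)}, {link(5, 3), link(3, 4), link(4, 6)}
lsps = [(w1, p1), (w2, p2)]

print("SPP, link 3-4:", protection_channels(lsps, shared=True)[link(3, 4)])
print("DPP, link 3-4:", protection_channels(lsps, shared=False)[link(3, 4)])
```

Because w1 and w2 are link-disjoint, no single link failure activates both protection LSPs, so one channel on link 3-4 is enough under SPP.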

3.5 State-of-the-art

Several previous works in the literature on protection at the optical layer have to be taken into consideration.

Several works studying the availability of different protection schemes in ring and mesh networks can be found in the literature. The pioneering paper [ToNe94] analyzes some long-haul network architectures from an unavailability point of view. In that study, a reference connection spanning 6,600 km with an availability objective of 99.92% (unavailability of $8 \cdot 10^{-4}$) is proposed. Failure rates and repair times, obtained from Bellcore, are also presented.

The study in [Ma03] presents several network topologies for a pan-European optical transport network, which are compared in terms of the availability of the connections routed over them. Only two types of connections are considered, the unprotected and the 1+1 protected connection.

In [ToMa05-2] the authors provide algebraic equations to evaluate the availability of single optical connections under DPP and SPP in mesh networks. The same authors presented in [ToMa05-2] a design technique for reliable optical transport networks. It consists of dimensioning the network to carry a given set of static protected optical connections, each one routed so as to maximize its availability.

The study in [DoCl03] shows how the service availability in SPP depends on the total capacity and the amount of sharing allowed in establishing the protection arrangements. Some methods for the analysis of the dual-failure restorability and related availability considerations in SPP are provided, as well as methods to optimize SPP capacity requirements with explicit limits on the number of primary service paths that are allowed to share the same backup link.

In [Ar00] different architectures for resilient ring and mesh based optical networks are described and qualitatively compared. Based on that study, Table 3-3 presents a comparison of different protection architectures.

Table 3-3 Comparison of protection architectures (adapted from [Ar00])

| Architecture | Link Cost | Node Cost | Flexibility | Availability | Recovery Time |
|---|---|---|---|---|---|
| OMS DPRing | Higher | Lowest | Low | High | Fast |
| OMS SPRing | Lower | Lowest | Low | High | Fast |
| DPP (rings) | High | High | High | High | Fast |
| SPP (rings) | Mid | Mid | High | High | Slower |
| OMS p-cycles | Mid | Mid | High | Mid/High | Slower |

The optical transport network uses the physical infrastructure. This layer is composed of optical patch panels, optical cables, conduits and other physical elements. Optical links may use common underlying infrastructure and thus fail simultaneously. A shared-risk link group (SRLG) defines a set of links that share such a common risk. In [Gu05, Gu07], routing algorithms with SRLG-disjoint protection for mesh networks are presented. In general, link failure dependency is an important factor to be considered when calculating disjoint routes. In practice, however, infrastructure information is not always available even within the same organization, due to historical and administrative reasons.

For the design of survivable networks, the planning phase should “create” the optical topology in such a way that no common infrastructure (optical cables, conduits, etc.) is used by any two links in the network. This is assumed throughout this thesis, so no additional measures are applied in the remaining chapters.

Some interesting papers have reported experimental work on protection in rings. In [Li05] different optical protection ring architectures are described. In particular, OCh Shared Protection Rings (OCh SPRing) are discussed in detail, including node architecture designs. An APS-like protocol transmitted on a dedicated DWDM channel manages the protection on the ring. Note that this is not a control plane. An experimental demonstration of OCh SPRing is shown. The architecture presented scales with the number of nodes in the ring, with the number of channels to be recovered and with the circumference length. As an example, recovery times within 50 ms can be obtained protecting 60 DWDM channels on a regional network with 12 nodes and a circumference of up to 1,000 km.

In [Mu05] the authors propose a dedicated protection mechanism. It is based on extensions to the RSVP-TE protocol for fault location and notification. The proposed mechanism can be applied to small metropolitan networks. Experimental tests on a three-node optical ring with a circumference of 35 km show recovery times of 45 ms. Although the mechanism is lacking in terms of scalability, this work is novel since it introduces a new functionality into the control plane: managing protection.

In this thesis we use concepts from both works. We use an APS-like protocol to manage the protection, and we use the control plane. However, we rely on the LMP protocol to implement that functionality at the control plane. Moreover, since the scalability of OCh SPRing depends on the number of connections to be protected, we implement OMS protection, which protects the complete DWDM bundle with only one protection action.

An interesting research area for network operators is waveband switching (WBS) [Ca06]. In WBS, wavelengths are grouped into bands and switched as single entities. Thus, a waveband is an intermediate entity between fibers and wavelengths. Our solution for OMS SPRing is based on separating wavelengths into two bands, one for working and one for protection. When a failure occurs, working and protection bands are switched. Our solution for OMS DPRing, on the other hand, is based on fiber switching.

Regarding LSP protection, DPP is typically applied to mesh networks. 1+1 DPP is very fast (on the order of milliseconds) [ApZa04], robust against multiple failures and requires a low degree of management complexity, but does not use the network resources efficiently.

SPP has usually been proposed for mesh networks, as resource sharing for protection LSPs is only performed among link-disjoint working LSPs. In [Ou04] the problem of computing working and protection routes under shared-path protection constraints for a connection request was proved to be NP-complete. In [MaCa08] the authors present enhanced GMPLS routing approaches for both SPP and DPP recovery schemes that, besides maximizing resource usage, also address connection blocking mostly due to the wavelength continuity constraint.

Nonetheless, in Chapter 7 we propose to implement SPP in GMPLS-based DWDM rings by using two different wavelengths. In rings, the routing and wavelength assignment (RWA) problem can be computed in polynomial time. The RWA problem appears in optical networks without wavelength conversion or with limited wavelength conversion. It consists of finding a route between two end nodes under the wavelength continuity constraint. The RWA problem has been extensively covered in the literature, e.g. in the survey [ZaJu00].

In [RaSi95] a lower bound on the blocking probability for any RWA algorithm is derived. That bound has been used as a metric against which the performance of different RWA algorithms can be compared. A simpler benchmarking method is to use the well-known First-Fit (FF) heuristic [ZaJu00]. It consists of systematically considering each wavelength in sequence, always choosing the first wavelength on which a route is feasible.
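The First-Fit rule can be sketched in a few lines. This is a minimal illustration that assumes the route has already been chosen and only performs the wavelength assignment. The function name, the link labels and the dictionary used to track occupied wavelengths are all hypothetical.

```python
def first_fit(route_links, used_wavelengths, num_wavelengths):
    """Assign the lowest-indexed wavelength that is free on every link of the route.

    used_wavelengths maps each link to the set of wavelength indices in use.
    Returns the assigned index, or None if the request is blocked.
    """
    for wavelength in range(num_wavelengths):
        if all(wavelength not in used_wavelengths.get(lk, set()) for lk in route_links):
            for lk in route_links:
                used_wavelengths.setdefault(lk, set()).add(wavelength)
            return wavelength
    return None


occupancy = {}
print(first_fit(["A-B", "B-C"], occupancy, num_wavelengths=2))  # first request
print(first_fit(["B-C", "C-D"], occupancy, num_wavelengths=2))  # overlaps on B-C
print(first_fit(["D-E"], occupancy, num_wavelengths=2))         # no overlap
print(first_fit(["B-C"], occupancy, num_wavelengths=2))         # B-C is full
```

The wavelength continuity constraint is what forces the same index to be free on every link of the route; a request is blocked when no such index exists.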

The current trend in networking is towards the design of networks supporting several levels of quality of service (QoS). A survey of differentiation frameworks can be found in [ChMy07]. In [FuTa06] the concept of differentiated reliability (DiR) in optical rings is introduced; a reliability degree is assigned to each individual LSP, irrespective of the underlying protection mechanism. In [ArKa03], the authors propose a framework denoted Quality of Reliability (QoRRT). In this framework, all connections have a dedicated protection path, and the recovery time is the time it takes to switch to the backup path. There is an infinite number of classes or grades, which are determined on the basis of the maximum recovery time. In [BoKu01] a framework for differentiation based on recovery time requirements is introduced. The framework, denoted Grade of Protection (GoP), is designed for mesh networks.

Our approach is closely related to the latter. In fact, in Chapter 7 we implement two differentiated classes of protection providing service recovery within 50 ms and within 100 ms.

The authors in [Ge07] propose a protection scheme which records each path's unavailability time. When a failed path approaches its Service Level Agreement (SLA) limit, it can preempt other paths to avoid violating the SLA. This scheme requires memory for path state information. Note that in networks with a distributed control plane, as is the case for ASON/GMPLS networks, this per-path state involves additional processing and memory requirements, as well as the design of signaling extensions. In our approach, however, paths are assigned to a class of service, and paths belonging to protected classes can preempt paths belonging to the lower, preemptable class of service. Hence, no path state information is needed, and the overhead introduced by our approach is much lower than that of [Ge07].

3.6 Summary

In this chapter some useful terms related to resiliency have been presented. Firstly, availability and equations to compute a system's availability have been introduced. Moreover, typical figures for failure rates and repair times in optical networks have also been shown. Then, the SLA has been defined as a contract between the client layer and the optical layer.

Then, specific recovery mechanisms have been presented and classified in terms of their granularity (link and LSP or OCh) and of their use of resources (dedicated and shared). Shared mechanisms typically provide better resource efficiency than dedicated mechanisms. However, their implementation is usually more complex.


Recovery schemes can be applied to ring and mesh networks. Although mesh networks are more efficient and flexible than ring-based networks, network operators typically rely on ring-based networks due to their inherently fast protection switching capabilities and for providing high LSP availability. However, the p-cycles concept provides ring speed with mesh efficiency; applying ring techniques over mesh networks is a key concept in emerging optical core networks.

Finally, the last section presents a survey about the state-of-the-art of the technologies used in this thesis.


Chapter 4

Availability Studies

Availability was defined in Chapter 3. In this chapter we define mathematical (algebraic) models to compute the LSP availability for the protection schemes presented in the previous chapter. Firstly, we focus on modeling the LSP availability for the OMS protection, both in rings and in mesh networks. Moreover, we exemplify why a protection scheme is strictly required. Then, LSP availability is modeled under path protection schemes in rings. Finally, at the end of the chapter, the protection schemes are compared in terms of the expected LSP availability. Although the obtained mathematical models can be used to calculate the expected connection availability under any arbitrary number of failures, in general, protection schemes are designed to deal with single failures.

In the following sections we use the set definitions presented in Table 4-1.

Table 4-1 Set of links definition

| Set | Definition |
|---|---|
| R | Set of all links in the ring |
| L | Set of links transporting a particular LSP |
| P | Set of all links in the p-cycles mesh network |
| C | Set of on-cycle links (p-cycles) |
| S | Set of straddling links (p-cycles) |

Sets R and L will be used in ring-based recovery schemes, whereas sets P, C, and S are used in p-cycles mesh networks. Note that, from the previous definitions and from the introduction to p-cycles in Chapter 3, P ≡ C ∪ S, with C ∩ S = Ø.


4.1 OMS protection schemes in rings

In this section we define an availability model for LSPs under the OMS DPRing and the OMS SPRing schemes defined in Chapter 3.

In an OMS DPRing, the LSP availability is given by the union of two disjoint groups of events, namely: 1) all links i in the ring are available and 2) one link in the ring is unavailable, while the rest of the links are available and can be used for ring protection. In this scheme, LSPs use resources in every link of the ring as shown in Fig. 3-2 and then, every LSP presents the same availability, and its value is given by the following expression:

$$A_{DPRing} = \prod_{i \in R} A_i + \sum_{i \in R} U_i \cdot \prod_{j \in R,\, j \neq i} A_j \tag{4.1}$$

As an example, we calculate the availability for the LSP A-D in the OMS DPRing network shown in Fig. 3-2a. For the sake of simplicity we consider that all the links have the same length (300 km). Then, using the values given in Table 3-1, the availability will be:

$$A^{A-D}_{DPRing} = A_{link}^{6} + 6 \cdot U_{link} \cdot A_{link}^{5} = 99.9981\% \tag{4.2}$$

Availability figures close to 100% are difficult to compare. For this reason, we will use the unavailability figure. Applying equation (3.2), the LSP A-D unavailability is:

$$U^{A-D}_{DPRing} = 1.88 \cdot 10^{-5} \tag{4.3}$$

This means that the LSP A-D will be unavailable, on average, 9.86 minutes/year under the OMS DPRing scheme.

We can use equation (4.1) to calculate also the LSP availability in OMS SPRing, taking into account that in this scheme the network is bidirectional and LSPs will be routed strictly through the shortest route. Therefore, in OMS SPRing each LSP has a different availability depending on its route. According to this, we can express the LSP availability over an OMS SPRing, as:

$$A_{SPRing} = \prod_{i \in L} A_i + \sum_{i \in L} U_i \cdot \prod_{j \in R,\, j \neq i} A_j \tag{4.4}$$

In this case, the unavailability for the LSP A-D under the OMS SPRing scheme shown in Fig. 3-3a, is:

$$U^{A-D}_{SPRing} = 1 - \left(A_{link}^{3} + 3 \cdot U_{link} \cdot A_{link}^{5}\right) = 1.50 \cdot 10^{-5} \tag{4.5}$$

This means that the LSP A-D will be unavailable, on average, 7.89 minutes/year over this OMS SPRing.
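Equations (4.1) and (4.4) share the same structure and differ only in the set of links over which the first product and the sum run (R for OMS DPRing, L for OMS SPRing). The sketch below evaluates both for the six-node example. The function name is hypothetical, and the link unavailability of 1.12e-3 is an assumption chosen to be consistent with the results in (4.3) and (4.5); the actual figure comes from Table 3-1, which is not reproduced here.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def oms_ring_availability(link_availability, lsp_links):
    """Equations (4.1)/(4.4): R is every key of link_availability, L is lsp_links."""
    all_up = prod(link_availability[i] for i in lsp_links)
    one_down = sum(
        (1.0 - link_availability[i])
        * prod(a for j, a in link_availability.items() if j != i)
        for i in lsp_links
    )
    return all_up + one_down


U_LINK = 1.12e-3  # assumed value for a 300 km link, not taken from Table 3-1
ring = {name: 1.0 - U_LINK for name in ("A-B", "B-C", "C-D", "D-E", "E-F", "F-A")}

u_dpring = 1.0 - oms_ring_availability(ring, list(ring))             # L = R
u_spring = 1.0 - oms_ring_availability(ring, ["A-B", "B-C", "C-D"])  # route A-D

print(f"OMS DPRing: U = {u_dpring:.3g} ({u_dpring * MINUTES_PER_YEAR:.2f} min/year)")
print(f"OMS SPRing: U = {u_spring:.3g} ({u_spring * MINUTES_PER_YEAR:.2f} min/year)")
```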


On the basis of the previous results, we can conclude that OMS shared protection provides better LSP availability than OMS dedicated protection, since in OMS shared protection it is possible to find shorter routes for the LSPs.

Fig. 4-1 shows the unavailability for the longest possible LSP in an OMS SPRing (solid lines) and in an OMS DPRing (dashed lines), as a function of the number of nodes (n) in the ring for several average link lengths (L) in a metropolitan environment.

[Figure 4-1: Expected Unavailability (U), logarithmic scale from 1E-09 to 1E-05, versus Number of nodes in the ring (n), from 4 to 20; curves U(SPRing) and U(DPRing), with the legend recovered only for L=10 km]

Fig. 4-1 LSP unavailability in OMS schemes in metropolitan rings

The target LSP availability in a network has to be chosen according to the distances in that network. In metropolitan networks, an availability of 0.99999 (five nines), or 5.26 minutes/year of total outage, is sometimes referred to as the availability objective. However, in long-haul core networks a four-nines availability objective (less than 53 minutes/year of total outage) can be considered more appropriate. Note that with the values in Table 3-1, $U_{link}$ ranges from $10^{-4}$ to $10^{-3}$ for lengths ranging from 30 to 300 km.
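The outage figures quoted for the availability objectives follow from a direct conversion. The sketch below is a minimal helper with a hypothetical name; it assumes a 365-day year (525,600 minutes).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: leap years are ignored


def outage_minutes_per_year(availability: float) -> float:
    """Expected total outage per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for target in (0.99999, 0.9999):
    print(f"A = {target}: {outage_minutes_per_year(target):.2f} minutes/year")
```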

[Figure 4-2: Expected Unavailability (U), logarithmic scale from 1E-06 to 1E-03, versus number of nodes in the ring, from 4 to 20; the unavailability objective is drawn as U lim]

Fig. 4-2 LSP unavailability in OMS schemes in long-haul rings


Fig. 4-2 shows the LSP unavailability in long-haul networks. The graph also shows the unavailability objective of $10^{-4}$, which corresponds to a target availability of 0.9999. We can conclude that the maximum overall length that meets this unavailability objective in rings with L=300 km is about 3,900 km for the OMS DPRing scheme, and about 4,500 km for the OMS SPRing scheme. Comparing the two, the OMS SPRing scheme provides an improvement of about 25% in the expected connection unavailability over the OMS DPRing scheme.

If no protection scheme is implemented, applying equation (3.6) to a ring network whose length is 3,800 km gives an unavailability of $1.4 \cdot 10^{-2}$. This implies more than 5 days/year (or 20 minutes/day) of total outage, which clearly does not meet the required network availability. Therefore, a recovery scheme is strictly required.

Finally, let us analyze the behavior of OMS schemes in a multiple failure scenario. In OMS DPRing, all LSPs will become unavailable under a double-link failure since they have the same route through all links in the ring. This is different in OMS SPRing, since each LSP may have a different route. Under a double-link failure, an LSP will remain working if the failures affect links which do not support that LSP; in all other cases the LSP will become unavailable. As an example, let us consider the OMS SPRing in Fig. 4-3, where links A-F and F-E fail simultaneously. In this case, LSP F-E will become unavailable while LSP A-D will remain working. Therefore, equations (4.1) and (4.4) can be used to calculate the expected LSP availability under any arbitrary number of failures.
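The rule described above can be written as a small predicate. The sketch below is illustrative only: the function name is hypothetical, links are identified by simple string labels, and the route of LSP A-D is assumed to be A-B-C-D.

```python
def lsp_available(lsp_links, failed_links):
    """Availability of one LSP in an OMS ring for a given set of failed links.

    The LSP stays up if none of its links failed, or if there is a single
    failure in the ring, which the loop-back protection can recover.
    """
    lsp_links, failed_links = set(lsp_links), set(failed_links)
    if not (lsp_links & failed_links):
        return True
    return len(failed_links) == 1


RING = ["A-B", "B-C", "C-D", "D-E", "F-E", "A-F"]
double_failure = ["A-F", "F-E"]

print("LSP F-E (SPRing):", lsp_available(["F-E"], double_failure))
print("LSP A-D (SPRing):", lsp_available(["A-B", "B-C", "C-D"], double_failure))
# In OMS DPRing every LSP uses all links of the ring (L = R).
print("Any LSP (DPRing):", lsp_available(RING, double_failure))
```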

[Figure 4-3: six-node ring (nodes A to F); legend: working and protection channels, wavebands B1 and B2]

Fig. 4-3 Double link failure in an OMS SPRing

4.2 OMS protection schemes in mesh networks

In OMS Dp-cycles mesh networks, LSP availability is given by the union of three disjoint groups of events: 1) all links i ∈ L are available; 2) one link in L is unavailable, but the rest of the links j ∈ P are available (links in C can be used for protection and links in S are not being protected); and 3) two links are unavailable, one in S and in L and the other in S but not in L, the link in L became unavailable first, and the rest of the links j ∈ P are available. This can be expressed as:

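$$P(E_{Dp\text{-}cycles}) = P\Big(\bigcap_{i \in L} E_i\Big) + \sum_{i \in L} P\Big(\bar{E}_i \cap \bigcap_{j \in P,\, j \neq i} E_j\Big) + \frac{1}{2} \sum_{t \in L \cap S}\ \sum_{u \in S \setminus L} P\Big(\bar{E}_t \cap \bar{E}_u \cap \bigcap_{j \in P,\, j \neq t,u} E_j\Big) \tag{4.6}$$

where $E_i$ denotes the event that link i is available and $\bar{E}_i$ its complement.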

Considering the links in the network as being mutually failure-independent, LSP availability over OMS Dp-cycles mesh networks can be expressed as:

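$$A_{Dp\text{-}cycles} = \prod_{i \in L} A_i + \sum_{i \in L} U_i \cdot \prod_{j \in P,\, j \neq i} A_j + \frac{1}{2} \sum_{t \in L \cap S}\ \sum_{u \in S \setminus L} U_t \cdot U_u \cdot \prod_{j \in P,\, j \neq t,u} A_j \tag{4.7}$$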

As an example, we will calculate the unavailability for the LSP A-D in the OMS Dp-cycles network shown in Fig. 3-6a, considering that all links are of the same length (300 km). Then, using the values given in Table 3-1, the unavailability will be:

$$U^{A-D}_{Dp\text{-}cycles} = 1 - A_{link}^{4} - 4 \cdot U_{link} \cdot A_{link}^{8} - \frac{1}{2} \cdot 2 \cdot U_{link}^{2} \cdot A_{link}^{7} = 3.122 \cdot 10^{-5} \tag{4.8}$$

Recall that the LSP unavailability under the OMS DPRing scheme, also with six nodes, is $1.88 \cdot 10^{-5}$, as computed in equation (4.3). This is equivalent to saying that the LSP A-D will be unavailable, on average, 16.41 minutes/year over an OMS Dp-cycles network, or 9.86 minutes/year if we use the OMS DPRing scheme.
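Equation (4.7) can be evaluated directly from the sets in Table 4-1. The sketch below is a minimal illustration with a hypothetical function name. The link identifiers (c1 to c6 on the cycle, s1 to s3 straddling) and the route of the LSP are assumptions chosen only to reproduce the term counts of (4.8), and the link unavailability of 1.12e-3 is an assumed value consistent with the quoted results, not a figure read from Table 3-1.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def dp_cycles_availability(link_availability, lsp_links, straddling_links):
    """Equation (4.7) with P = all links, L = lsp_links, S = straddling_links."""
    P, L, S = set(link_availability), set(lsp_links), set(straddling_links)
    U = {k: 1.0 - a for k, a in link_availability.items()}

    all_up = prod(link_availability[i] for i in L)
    one_down = sum(
        U[i] * prod(link_availability[j] for j in P if j != i) for i in L
    )
    two_down = 0.5 * sum(
        U[t] * U[u] * prod(link_availability[j] for j in P if j not in (t, u))
        for t in L & S
        for u in S - L
    )
    return all_up + one_down + two_down


U_LINK = 1.12e-3  # assumed value for a 300 km link
on_cycle = [f"c{k}" for k in range(1, 7)]
straddling = ["s1", "s2", "s3"]
network = {name: 1.0 - U_LINK for name in on_cycle + straddling}

# Assumed four-link route that uses one straddling link.
route = ["c1", "c2", "c3", "s1"]
u_dp = 1.0 - dp_cycles_availability(network, route, straddling)
print(f"U = {u_dp:.4g} ({u_dp * MINUTES_PER_YEAR:.2f} min/year)")
```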

In this case, the unavailability is higher than in the ring networks case. This is due to the fact that all links in the network affect the LSP availability and the mesh network has three additional straddling links with respect to the ring network. To generalize this, Fig. 4-4 shows the effect of increasing the number of straddling links over the unavailability of the longest possible LSP in OMS Dp-cycles networks with n nodes.

[Figure 4-4: Expected Unavailability, logarithmic scale from 5E-05 to 5E-04, versus Number of nodes (n), from 10 to 20, for L=300 km; curves U(std=1) to U(std=5)]

Fig. 4-4 Increasing LSPs unavailability by adding straddling links.


However, the existence of straddling links provides, in general, shorter routes, which counteracts their contribution to a higher LSP unavailability.

In Fig. 4-5 a comparison between LSP unavailability in OMS DPRing and OMS Dp-cycles networks is shown. In the former, all LSPs are routed through the complete ring and thus all present exactly the same unavailability. In the latter, LSPs can be routed through a number of links which is, in general, lower than in the previous case. If the end nodes are directly connected through a straddling link, the unavailability will be much lower than in the OMS DPRing.

On the other hand, if the LSP is routed through the longest possible route, its unavailability will be slightly worse than in the OMS DPRing. However, in a well-planned mesh network routes should be much shorter than in ring networks.

[Figure 4-5: Expected Unavailability, logarithmic scale from 1E-05 to 1E-03, versus Number of nodes (n), from 5 to 20; curves U(Dp-cycles) and U(DPRing)]

Fig. 4-5 Comparing LSP unavailability in OMS Dp-cycles and in OMS DPRing networks.

We can use equation (4.7) to calculate also the LSP availability in OMS Sp-cycles mesh networks, taking into account that, in this case, the network is bidirectional and LSPs will be routed strictly through the shortest route. In this case, the unavailability for the LSP A-D in the OMS Sp-cycles network shown in Fig. 3-6b is:

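$$U^{A-D}_{Sp\text{-}cycles} = 1 - \left(A_{link}^{2} + 2 \cdot U_{link} \cdot A_{link}^{8} + \frac{1}{2} \cdot 2 \cdot U_{link}^{2} \cdot A_{link}^{7}\right) = 1.749 \cdot 10^{-5} \tag{4.9}$$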

This means that the LSP A-D will be unavailable, on average, 9.25 minutes/year over this OMS Sp-cycles network.

On the basis of the previous results, OMS shared protection provides better LSP availability than OMS dedicated protection, since in OMS shared protection it is possible to find shorter routes for the LSPs.

Fig. 4-6 shows a comparison of LSP unavailability between OMS Dp-cycles and OMS Sp-cycles as a function of the number of nodes in the network. The effect of adding straddling links has been studied above. Here, in order to compare both cases, we assume three straddling links in the network.


In OMS p-cycles, the unavailability of an LSP will be some value between the unavailability of the pure straddling LSP (the minimum) and the unavailability of an LSP routed through the longest possible path (the maximum). In the case of the pure straddling LSP, the LSP unavailability is the same in the OMS dedicated and OMS shared schemes. In contrast, the worst unavailability is found in the OMS dedicated scheme, as discussed previously.

[Figure 4-6: Expected Unavailability, logarithmic scale from 1E-05 to 1E-03, versus Number of nodes (n), from 5 to 20; curves U(Dp-cycles)max, U(Sp-cycles)max and U(p-cycles)min]

Fig. 4-6 Unavailability of LSPs in OMS p-cycles networks.

Finally, the contribution of the term in $U_{link}^{2}$ to the LSP unavailability is small (≈ 5%). If we drop this term when calculating the LSP unavailability, we are assuming that the LSP becomes unavailable whenever two links are unavailable and one of them affects the LSP. Therefore, we can consider (4.10) as an upper bound on the LSP unavailability.

$$A_{p\text{-}cycles} = \prod_{i \in L} A_i + \sum_{i \in L} U_i \cdot \prod_{j \in P,\, j \neq i} A_j \tag{4.10}$$
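How much the bound differs from the full expression can be estimated with a closed form for equal links. In the sketch below, the function name and its parameters are hypothetical: `hops` is the number of links in L, `total_links` is the number of links in P, and `pair_count` is the number of (t, u) pairs in the double sum of (4.7), so that `pair_count=0` yields the bound (4.10). The link unavailability of 1.12e-3 and the nine-link network are assumptions consistent with the worked examples, not values read from Table 3-1.

```python
def p_cycles_unavailability(u_link, hops, total_links, pair_count=0):
    """LSP unavailability for equal links; pair_count=0 gives the bound (4.10)."""
    a_link = 1.0 - u_link
    availability = a_link**hops + hops * u_link * a_link ** (total_links - 1)
    availability += 0.5 * pair_count * u_link**2 * a_link ** (total_links - 2)
    return 1.0 - availability


U_LINK = 1.12e-3  # assumed value for a 300 km link
for name, hops in (("Dp-cycles", 4), ("Sp-cycles", 2)):
    full = p_cycles_unavailability(U_LINK, hops, total_links=9, pair_count=2)
    bound = p_cycles_unavailability(U_LINK, hops, total_links=9)
    share = (bound - full) / bound
    print(f"{name}: U = {full:.4g}, bound = {bound:.4g}, dropped term = {share:.1%}")
```

Under these assumptions the dropped term amounts to a few percent of the bound, in line with the order of magnitude stated above.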

4.3 Path protection

In this section we provide mathematical models to calculate the LSP availability under path protection in rings. These models cannot be applied to mesh networks, since they rely on ring-specific assumptions, as explained in the following.

The availability of a LSP using the SPP scheme (SP) is given by the union of two disjoint groups of events, namely: 1) all links i in the working LSP are available and 2) one link in the working LSP is unavailable, while the links of the protection LSP are available and can be used for protection. Thus, the availability can be expressed as:


$$A_{SP} = \prod_{i \in L} A_i + \sum_{i \in L} U_i \prod_{j \in R \setminus L,\, j \neq i} A_j \tag{4.11}$$

Note that equation (4.11) models the availability of a protected LSP without any resource sharing (i.e. DPP). Let us analyze the behavior of SPP in a multiple failure scenario. As an example, let us consider the ring in Fig. 6-1b. Three cases can be distinguished:

1) Two links transporting one working LSP fail simultaneously; the working LSP will be protected using the protection LSP. For example, if links 5-6 and 6-7 fail simultaneously, the LSP 1 continues working using the protection LSP p1. The LSP 2 is not affected by the failure.

2) Two links, one transporting the working LSP and the other transporting the protection LSP of the same protected LSP, fail simultaneously; the protected LSP is cut. For example, if links 5-6 and 1-7 fail simultaneously, the LSP 1 is cut. The LSP 2 is not affected by the failure.

3) The same as 2), but the link transporting the protection LSP also transports the working LSP of another SP LSP; both LSPs are cut. For example, if links 5-6 and 2-3 fail simultaneously, the LSP 1 and the LSP 2 are cut.

This is exactly the same behavior as in DPP, where both LSPs have been routed using different wavelengths as in Fig. 6-1a. Therefore, we can conclude that shared-path protection in rings presents the same availability as dedicated protection, while showing a better resource usage ratio. This may not be true in mesh networks due to the resource sharing.

On the other hand, a best-effort (BE) LSP (as defined in Chapter 3) will be available when all links i in L are available, but will be preempted when one link not in L is unavailable while the rest of the links not in L are available and can be used for the protection of an SP LSP. This is equivalent to saying that a BE LSP will be available when all links i in the ring are available.

$$A_{BE} = \prod_{i \in R} A_i \tag{4.12}$$

Note that, in the case of DPP, the BE LSPs are torn-down when the associated protected LSP is torn-down. These forced torn-downs provide additional unavailability to BE LSPs under the DPP scheme. In the SPP scheme no forced torn-downs are needed, and thus (4.12) can be used to compute the BE LSP availability.
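The three LSP classes can be compared numerically with equations (4.11) and (4.12). The sketch below is illustrative: the function names are hypothetical, the unprotected availability is taken as the product of the availabilities of the links in L (a series-system assumption), and the link unavailability of 1.12e-3 is an assumed value for a 300 km link rather than a figure read from Table 3-1.

```python
from math import prod


def spp_availability(ring, working_links):
    """Equation (4.11): working LSP on L, protection LSP on the links of R not in L."""
    L = set(working_links)
    working_up = prod(ring[i] for i in L)
    protection_up = prod(a for j, a in ring.items() if j not in L)
    return working_up + sum((1.0 - ring[i]) * protection_up for i in L)


def best_effort_availability(ring):
    """Equation (4.12): a BE LSP needs every link in the ring to be available."""
    return prod(ring.values())


def unprotected_availability(ring, lsp_links):
    """Series-system assumption: every link of the route must be available."""
    return prod(ring[i] for i in lsp_links)


U_LINK = 1.12e-3  # assumed value for a 300 km link
ring = {name: 1.0 - U_LINK for name in ("A-B", "B-C", "C-D", "D-E", "E-F", "F-A")}
longest_route = ["A-B", "B-C", "C-D"]  # half of the six-link ring

u_sp = 1.0 - spp_availability(ring, longest_route)
u_be = 1.0 - best_effort_availability(ring)
u_up = 1.0 - unprotected_availability(ring, longest_route)
print(f"U(SP) = {u_sp:.3g}, U(BE) = {u_be:.3g}, U(UP) = {u_up:.3g}")
print(f"U(BE) / U(UP) = {u_be / u_up:.2f}")
```

For the longest route, which spans half of the ring, the BE unavailability is close to twice that of an unprotected LSP, because the BE LSP depends on all links in the ring.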

Fig. 4-7 shows the unavailability of the longest possible routes for both LSP classes as a function of the number of nodes in the ring. The unprotected LSP unavailability is also plotted for reference. We assume that the average link length is 300 km. We can observe how preemption of BE LSPs leads to unavailability values that double those of unprotected LSPs. As expected, SP LSPs present the best availability, which is provided by the protection.


[Figure: expected unavailability (U), on a logarithmic scale from 1E-06 to 1E-01, for rings of 4 to 16 nodes. Curves: U (SP/DP LSP), U (BE LSP), U (UP LSP).]

Fig. 4-7 LSP unavailability.

4.4 Summary

The LSP availability is an important criterion for comparing different protection strategies. We have studied the LSP availability provided by protection schemes, both in ring and in mesh networks. Expressions to calculate the LSP availability have been described for the different schemes.

In rings, OMS shared protection provides better LSP availability than OMS dedicated protection, since in OMS SPRing it is possible to find shorter routes for the LSPs.

In mesh networks, we have studied the LSP availability under dedicated and shared OMS protection schemes. A model to calculate the LSP availability has also been described for both schemes. The model can be condensed as (4.16), which is the upper bound for mesh-based OMS p-cycle schemes. OMS Sp-cycles provide better LSP availability than OMS Dp-cycles because LSPs are routed through a shorter route in the former than in the latter. Moreover, straddling links add extra unavailability to LSPs over mesh networks compared with LSPs with the same number of hops in ring networks. However, the existence of straddling links provides, in general, shorter routes, which counteracts that extra unavailability.

Besides, algebraic expressions to calculate the LSP availability under path protection in rings have been built. In this scenario, shared-path protection and dedicated protection present the same LSP availability.

Finally, Table 4-2 presents a comparison of the different protection schemes described, in terms of the expected LSP availability.


Table 4-2 Protection schemes availability comparison

| Protection Scheme | LSP Availability |
|---|---|
| OMS DPRing | Middle, lower than OMS SPRing. |
| OMS SPRing | High |
| OMS Dp-cycles | Lowest |
| OMS Sp-cycles | Higher than OMS SPRing, considering average route lengths. |
| DPP (rings) | Highest |
| SPP (rings) | The same as DPP in rings |

In the following chapters we implement protection schemes at both the OMS and the path layers, in rings. There, all protection schemes analyzed present the same availability. However, as a consequence of their different characteristics, the applicability of each scheme is different, as will be demonstrated.


Chapter 5

The CARISMA Network Test-Bed

The CARISMA network test-bed has been implemented to be used as a multi-domain field-trial for the integration and evaluation of the ASON/GMPLS technologies. Fig. 5-1 presents the architecture of the CARISMA network test-bed. It contains the following three functional planes:

Transport plane, responsible for traffic transport and switching.

Control plane, responsible for connection and resource management. It can be either associated with (in-fiber) or separated from (out-of-fiber) the managed transport network.

Management plane, responsible for management of the whole system (including transport and control planes). It triggers commands to the control plane to set-up and tear-down soft-permanent connections. At the management plane of the ASON/GMPLS CARISMA network test-bed, the Network Management System (NMS) was implemented as a web-based application [EsFi05], easing network management through the Internet.

CARISMA is a generic denomination which jointly denotes a set of applications, technologies, and infrastructures (see for example [EsFi05, EsSp05, PeEs07, GCOICTON07]). This chapter strictly presents the work done within this thesis.

At the transport plane, two alternative nodes are presented: the physical node and the emulated node. The functional and physical design of a semi-Reconfigurable OADM is described. Moreover, the interface designed to convey management information between the optical nodes at the transport plane and the management or control plane is presented.

At the control plane, every OCC contains three modules which communicate with one another. Finally, our request generator, a tool that allows the solutions proposed in the remainder of this thesis to be compared in terms of blocking probability, is also presented.

[Figure: the Management Plane, the Control Plane and the Transport Plane; each OCC at the control plane comprises a Connection Controller (CC), a Link Resource Manager (LRM) and a Routing Controller (RC).]

Fig. 5-1 The CARISMA network test-bed

5.1 The Transport Plane

The CARISMA transport plane uses both physical and emulated optical nodes. Using the physical nodes, called semi-reconfigurable OADMs (sROADM), a unidirectional ring can be built. However, in order to perform tests over more complex networks, including mesh networks, OXC node emulators have been developed. Both the sROADM and the OXC emulators provide the same CCI interface, making the concrete configuration of the transport plane transparent to the control plane.

5.1.1 sROADM design

The functional design for the Semi-Reconfigurable Optical Add Drop Multiplexer (sROADM) is shown in Fig. 5-2. This optical node allows dropping two wavelengths out of the bundle of DWDM-multiplexed signals and adding two new wavelengths to the DWDM-multiplexed bundle. In order to simplify subsequent figures, the schematic icon shown at the bottom in Fig. 5-2 will be used.

The monitoring device extracts a small part of the incoming optical power, transforms the sample into a digital value by means of an A/D converter, and stores the converted value in a register. The monitoring sweep time is 10 μs.


Two demultiplexer/multiplexer modules from Bookham [Book] have been used in order to extract/insert 8 contiguous DWDM wavelengths with 100 GHz spacing. Although 8 wavelengths are extracted, the optical node is able to use only 4 of them, for economic reasons. The mux/demux module has insertion losses lower than 3.5 dB. All remaining channels are reflected onto the express port. The express port can be used to connect a full Reconfigurable OADM module.

Fig. 5-2. sROADM functional design

The sROADM node is equipped with a 4x2 optical switch fabric, in order to select the wavelengths to drop. The 4x2 optical switch has been built using simpler 2x2 and 2x1 optical switches. The switching device, from Sercalo [Sercalo], has a very fast response time (tswitch), below 1 ms, and insertion losses lower than 0.9 dB.

The two dropped wavelengths are adapted in the transponder module providing the interface to client signals.

In the transmission direction, the sROADM node can add two wavelengths to be multiplexed by the optical multiplexer into a DWDM-multiplexed bundle. Additionally, two splitters allow performing multicast.

Using these unidirectional sROADM nodes, it will be possible to deploy unidirectional ring networks, as shown in Fig. 5-3. As fixed lasers will be used, each node of the ring will have unique behavior.

[Figure labels: IN (WEST Rx), OUT (EAST Tx), express-out, express-in, M, demultiplexer/multiplexer ports 1 to 8, λ5, λ6.]


Each node is able to drop 2 incoming wavelengths from the ring and can add 2 exclusive wavelengths to the ring. It is possible to choose among 4 different incoming wavelengths in order to drop 2 of them. Two incoming wavelengths are selected in a fixed way and two more in a flexible way by using the splitters.

Fig. 5-3. Unidirectional ring with three sROADMs

Using the defined sROADM design, a three-node unidirectional ring can be built. First, a node inserts an exclusive wavelength into the ring. The second and third nodes in the ring can drop this wavelength using the optical switches.

Both flows of a bidirectional LSP are routed on the sides of the ring, using different wavelengths. There is no possibility to reuse wavelengths on the ring due to the internal architecture of the nodes. As the nodes are equipped with two transponders belonging to a collection of 6 different wavelengths, the maximum capacity that can be allocated on a three-node ring is limited to three bidirectional channels.

Depending on the traffic pattern to be transported, this kind of unidirectional ring will support a different number of bidirectional channels. Let us analyze two different traffic patterns. The first one represents one bidirectional communication between adjacent nodes in the ring and it is able to transport up to 3 bidirectional optical channels, the maximum traffic capacity. The second traffic pattern represents two bidirectional communications between two nodes in the ring and it is able to transport only up to 2 bidirectional optical channels. Fig. 5-4 shows both traffic patterns for bidirectional communications.

[Figure: three-node ring (Node A, Node B, Node C) shown for two traffic patterns. Wavelength assignments in one panel: A->B λ1, B->A λ4; A->C λ2, C->A λ5; B->C λ3, C->B λ6. Wavelength assignments in the other panel: A->B λ1, B->A λ4; A->B λ2, B->A λ3.]

Fig. 5-4. Traffic patterns

The functional design of the 2x2.5 Gbit/s transponder card is shown in Fig. 5-5. It allows transforming two 2.5 Gbit/s client signals into fixed DWDM wavelengths to be added to a DWDM link.

[Figure: client interfaces, SR SFP, PM, PM, DWDM SFP, network interfaces.]

Fig. 5-5. 2x2.5 Gbit/s Transponder card

The transponder card integrates several modules:

SFP client transceiver: It transforms a client 2.5 Gbit/s optical signal into an electrical signal to be processed and vice versa. It supports up to OC-48 SONET/STM-16 SDH, G.709 FEC and GbE. The speed and protocol are selected through software.

DWDM SFP: It transforms a fixed DWDM 2.5 Gbit/s wavelength into an electrical signal to be processed and vice versa.

PM module: It monitors the link for errors in a non-disruptive way. This module supports SONET/SDH or Ethernet performance monitoring in both the receiving and transmitting paths. All necessary monitor functions for SONET/SDH and GbE are provided for OAM and provisioning.

Integrated switch: It allows reconfiguration and loopbacks to the client and to the line. Loopbacks are used to test end-to-end signal continuity for the link and for the client.

Integrated booster amplifier: It can optionally be placed at the nodes. Amplification can be performed either on a per-wavelength basis or on a per-link basis. If the per-wavelength option is chosen, booster amplifiers are integrated in the transponder card, as shown in Fig. 5-5.

The transponder card supports both Add/Drop and Regeneration as shown in Fig. 5-6.







[Figure panels: single 2.5 Gbit/s regenerator; 2.5 Gbit/s transponder for LSP protection; dual 2.5 Gbit/s transponder for unprotected LSPs.]

Fig. 5-6. Different uses of the transponder card

The sROADM uses a 100 Mbit/s Ethernet signal over the OSC to convey the control plane information when the in-fiber out-of-band option (described in the next section) is used. Moreover, specific mux/demux components are equipped in order to attach/detach the 1310 nm OSC optical signal to/from the DWDM bundle.

The physical sROADM will be housed in a chassis designed to be placed in a standardized 19-inch rack. In order to do so, the functionality depicted in Fig. 5-2 has been separated into several building blocks. Each building block has been conceived to be physically implemented as a separate plug-in card, as shown in Fig. 5-7.

Three different blocks, or cards, have been defined: the Mux card, the Transponder card and the Optical Switching and Monitoring (OSNL) card. Fig. 5-8 shows the physical layout of the resulting OSNL and Master cards and the test-bed where the complete architecture is being tested.


Fig. 5-7 sROADM physical layout and building blocks

Fig. 5-8 Physical layout of OSNL, Transponder and Master cards and test-bed.

[Figure labels: chassis slots 1 to 12 and M; cards 2xMx8c, 2xTp, OSNL and Master; demultiplexer/multiplexer ports 1 to 8; λa, λb; client.]


5.1.2 OXC emulator

As introduced previously, OXC emulators are needed to build complex topologies. Moreover, OXC emulators allow testing new optical node architectures and network solutions. In this regard, we use OXC emulators to measure times related to protection/restoration. Besides the CCI interface, the OXC emulator implements an additional interface to receive failure information related to the equipped ports.

The OXC emulator provides configurable delays when implementing SNCs. Upon reception of a command through the CCI, the request is stored in a first-in first-out (FIFO) request queue. A connection processor continuously retrieves commands from the queue and executes each of them after introducing the specified delay, emulating the switching time (tswitch) of a physical device.
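The queue-and-processor behavior described above can be sketched in Python as follows. This is only an illustration under assumptions: the class name `OxcEmulator`, its methods and the 1 ms delay are hypothetical and do not reproduce the emulator developed for the test-bed.

```python
import queue
import threading
import time

class OxcEmulator:
    """Hypothetical emulator: a FIFO request queue served by one connection processor."""

    def __init__(self, t_switch_s):
        self.t_switch_s = t_switch_s      # emulated switching time, in seconds
        self.requests = queue.Queue()     # queue.Queue delivers items in FIFO order
        self.completed = []               # request ids, in completion order
        worker = threading.Thread(target=self._process, daemon=True)
        worker.start()

    def submit(self, req_id, command):
        """Store a command received through the CCI (here, a plain method call)."""
        self.requests.put((req_id, command))

    def _process(self):
        """Connection processor: retrieve, wait tswitch, then mark as executed."""
        while True:
            req_id, command = self.requests.get()
            time.sleep(self.t_switch_s)   # emulate the device switching time
            self.completed.append(req_id)
            self.requests.task_done()

emulator = OxcEmulator(t_switch_s=0.001)  # assumed delay of 1 ms
for req_id in (1, 2, 3):
    emulator.submit(req_id, "create")
emulator.requests.join()                  # block until every request is served
print(emulator.completed)
```

Because a single processor serves the queue, requests complete in arrival order and each one is delayed by the configured switching time.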

5.1.3 Management interfaces

The standard management framework currently used is the Simple Network Management Protocol (SNMP) [RFC-3411]. It was introduced in the late 1980s and is widely supported by network devices. SNMP is a special-purpose management protocol that can be used to read and write simple typed variables. The software component that handles the associated Get/Set requests and accesses the internal data structures on managed devices is called an agent. In addition to processing such requests, an agent can also generate notifications under certain circumstances and send them as unsolicited messages to the management application (manager). This architecture is known as the manager-agent paradigm. Concrete data models for managing specific technologies or protocols are defined and standardized in management information base (MIB) modules, which are written in a language based on Abstract Syntax Notation 1 (ASN.1) [X.680].

However, SNMP has been used mostly in monitoring for fault and performance management, but has been hardly used for configuration management due to its limitations [ScPr03]. For example, the object identifier (OID), which is a naming mechanism of SNMP, is so simple and verbose that it is very inefficient in usage and implementation. Moreover, configuration tasks require several high-level management operations such as download, activation, rollback, and restoration. The SNMP Set operation can be used to realize such operations as side-effects, but it makes management applications very complicated. Therefore, with SNMP it is difficult to support various operations such as to load/restore configuration, activate a new configuration at a specific time, and roll back a configuration [NetConf]. Finally, UDP is the preferred transport of SNMP for IPv4. The size of SNMP over UDP messages is usually limited by the size of the maximum transmission unit (MTU), which is insufficient for bulk configuration data transfers.


To overcome the shortcomings of SNMP, Extensible Markup Language (XML) technology can be used for configuration management. We have integrated an XML-based agent in our sROADM, implementing a proprietary XML-based protocol. The protocol is connection-oriented, requiring a persistent connection between the manager and the agent. This connection provides reliable and sequential data delivery. An example of the XML command to set up a new optical connection is shown in Table 5-1.

Additionally, in order to provide a standard SNMP management interface, a module in charge of translating SNMP Get/Set requests into XML commands and vice versa has also been provided.

Table 5-1 Example of XML command

<cci reqid="5">
  <snc id="0x10c10c5a" command="create">
    <swtype>lsc</swtype>
    <bw>2488</bw>
    <bidir>1</bidir>
    <cp>
      <type>input</type>
      <port>0.1.10.1</port>
      <wavelength>4</wavelength>
    </cp>
    <cp>
      <type>output</type>
      <port>0.1.9.1</port>
      <wavelength>1</wavelength>
    </cp>
    <cp>
      <type>input</type>
      <port>0.1.9.1</port>
      <wavelength>1</wavelength>
    </cp>
    <cp>
      <type>output</type>
      <port>0.1.10.1</port>
      <wavelength>4</wavelength>
    </cp>
  </snc>
</cci>
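To illustrate how a manager could assemble a command like the one in Table 5-1, the following Python sketch builds and re-parses the message with the standard `xml.etree.ElementTree` module. The element and attribute names come from Table 5-1; the function name `build_create_snc`, its parameters and the use of `0` for a unidirectional connection are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def build_create_snc(req_id, snc_id, connection_points, bw=2488, bidir=True):
    """Build a <cci> request that creates one SNC (hypothetical helper).

    connection_points: list of (type, port, wavelength) tuples.
    """
    cci = ET.Element("cci", reqid=str(req_id))
    snc = ET.SubElement(cci, "snc", id=snc_id, command="create")
    ET.SubElement(snc, "swtype").text = "lsc"
    ET.SubElement(snc, "bw").text = str(bw)
    # Only <bidir>1</bidir> appears in Table 5-1; "0" is an assumed encoding.
    ET.SubElement(snc, "bidir").text = "1" if bidir else "0"
    for cp_type, port, wavelength in connection_points:
        cp = ET.SubElement(snc, "cp")
        ET.SubElement(cp, "type").text = cp_type
        ET.SubElement(cp, "port").text = port
        ET.SubElement(cp, "wavelength").text = str(wavelength)
    return cci

# Connection points taken from Table 5-1.
cps = [("input", "0.1.10.1", 4), ("output", "0.1.9.1", 1),
       ("input", "0.1.9.1", 1), ("output", "0.1.10.1", 4)]
message = ET.tostring(build_create_snc(5, "0x10c10c5a", cps), encoding="unicode")

# The agent side would parse the message before acting on it.
parsed = ET.fromstring(message)
print(parsed.get("reqid"), len(parsed.findall("./snc/cp")))
```

Serializing and parsing with the same library keeps the sketch self-checking; transport over the persistent manager-agent connection is outside its scope.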

5.2 The Control Plane

The CARISMA GMPLS control plane uses the RSVP-TE protocol for signaling, the OSPF-TE protocol for routing, and the LMP protocol for control channel management and link property correlation. The OCCs have been implemented using Linux-based routers. Each pair of OCCs communicates through a single IP control channel implemented with full-duplex Fast Ethernet links. Finally, each OCC communicates with the local OXC through the CCI.

The OCC contains three modules which communicate with one another, as shown in Fig. 5-1: the Link Resource Manager (LRM), the Routing Controller (RC), and the Connection Controller (CC).

[Figure: the Link Resource Manager, composed of the LRM Module, OXC Manager, LMP Module, MsgFSM, MIB and LRM Server, with interfaces to the CC, the RC, the CCI and LMP, and Configuration, Alarms and Notification flows.]

Fig. 5-9 The Link Resource Manager

The LRM module (Fig. 5-9) is responsible for the management of the resources available at the optical node. Note, however, that the extended version of the LRM available in the CARISMA control plane also manages control plane resources and implements additional functionalities such as control channel management, resource discovery, control plane recovery, etc. The state of the transport plane resources is stored in the MIB, and the OXC Manager module synchronizes the state of those resources with the optical node through the CCI. The same interface is used by the node to notify alarms using SNMP traps.

Whichever node is used, physical or emulated, the view of that node at the control plane is the one represented in Fig. 5-10, which follows the network model described in Chapter 2. A node contains a set of ports which are connected to TE-links in a one-to-one relationship. Every port contains a set of CPs and every TE-link contains a set of data-links. CPs and data-links are associated and represent every wavelength in the DWDM bundle. Note that CPs and data-links are unidirectional, while TE-links and ports can be bidirectional when encompassing different input and output data-links/CPs.


[Figure: an OXC with Port1 to Port6 attached to TELink1 to TELink6; each port contains CP1 to CPn and each TE-link contains DLink1 to DLinkn.]

Fig. 5-10 The OXC Model: TE-links, data-links, and CPs
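The relationships in this model can be sketched as plain data structures. The following Python sketch is an illustration only: the class and field names are hypothetical, and the choice of 6 ports with 4 wavelengths each is an assumption made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class ConnectionPoint:
    wavelength: int
    direction: str            # "in" or "out"; CPs are unidirectional
    in_use: bool = False

@dataclass
class DataLink:
    wavelength: int
    direction: str            # unidirectional, associated with one CP

@dataclass
class TELink:
    name: str
    data_links: list = field(default_factory=list)

@dataclass
class Port:
    name: str
    te_link: TELink           # one-to-one relationship with a TE-link
    cps: list = field(default_factory=list)

def make_port(index, n_wavelengths):
    """Create a bidirectional port: one CP and one data-link per wavelength and direction."""
    te_link = TELink(f"TELink{index}")
    port = Port(f"Port{index}", te_link)
    for wavelength in range(1, n_wavelengths + 1):
        for direction in ("in", "out"):
            port.cps.append(ConnectionPoint(wavelength, direction))
            te_link.data_links.append(DataLink(wavelength, direction))
    return port

# Hypothetical OXC with 6 ports and 4 wavelengths per port.
oxc_ports = [make_port(i, n_wavelengths=4) for i in range(1, 7)]
print(len(oxc_ports), len(oxc_ports[0].cps), len(oxc_ports[0].te_link.data_links))
```

Keeping CPs and data-links unidirectional, and pairing them per wavelength, mirrors the statement that ports and TE-links become bidirectional only by grouping input and output elements.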

An LSP is a concatenation of connections (SNCs) performed in every node along the LSP's route between the source and destination nodes. Thus, the main objective of an OXC is to connect input and output optical signals. In this regard, Fig. 5-11 and Fig. 5-12 represent the models for unprotected and protected SNCs, respectively. In Fig. 5-11 the unprotected SNC connects two pairs of CPs. Each pair of CPs belongs to a different port in the OXC. Two general types of unprotected SNCs can be distinguished: the passthrough and the add&drop.

[Figure: an SNC connecting CPi1/CPo1 at Port1 (TELink1, iDLink1/oDLink1) with CPi2/CPo2 at Port2 (TELink2, iDLink2/oDLink2); add&drop connection and passthrough connection.]

Fig. 5-11 Unprotected SNC model

In Fig. 5-12 three pairs of CPs are connected in a protected relationship. One pair of CPs represents the working signal, another pair the protection signal, and the third pair represents the protected signal. Before the incoming signal at the input working or protection CPs is connected, a selector chooses which of the two signals is used. In the event of a failure, input CPs change their state to unavailable. At this moment, the selector can change the chosen signal from the working to the protection one, or vice versa. In the output direction a splitter is used to duplicate the signal, sending it to both the working and the protection output CPs.


[Figure: a protected SNC connecting CPi2/CPo2 at Port2 (TELink2, iDLink2/oDLink2) with CPi1a/CPo1a at Port1a (TELink1a) and CPi1b/CPo1b at Port1b (TELink1b); Working and Protection labels.]

Fig. 5-12 Protected SNC model
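The selector and splitter behavior of a protected SNC can be sketched as follows. The class `ProtectedSnc` and its methods are hypothetical, and the non-revertive policy (the selector does not return to the working signal when it recovers) is an assumption made for the example; the text above only states that the selector can switch in either direction.

```python
class ProtectedSnc:
    """Hypothetical model of the selector and splitter in a protected SNC."""

    def __init__(self):
        self.available = {"working": True, "protection": True}
        self.selected = "working"

    def set_cp_state(self, side, available):
        """Update the state of an input CP and re-evaluate the selector."""
        self.available[side] = available
        if not self.available[self.selected]:
            other = "protection" if self.selected == "working" else "working"
            if self.available[other]:
                self.selected = other   # switch to the surviving signal

    def output_cps(self, signal):
        """The splitter duplicates the outgoing signal to both output CPs."""
        return {"working": signal, "protection": signal}

snc = ProtectedSnc()
snc.set_cp_state("working", False)       # failure on the working side
print(snc.selected)                      # selector now uses the protection signal
snc.set_cp_state("working", True)        # working side repaired
print(snc.selected)                      # assumed non-revertive: stays on protection
```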

Moreover, the LRM contains the LMP module. In this thesis, the LMP module only implements the LMP messages related to Fault Management. Finally, the LRM server module implements several communication interfaces with the RC and CC modules belonging to the same OCC.

The RC (Fig. 5-13) is responsible for route computation. It implements several routing algorithms to compute control plane routes. In this regard, it is worth noting that the CARISMA network test-bed uses different addressing spaces at the control plane and at the transport plane. In fact, the quagga OSPF module [QUAGGA] implements the OSPF protocol, which floods link-state information related to the control plane IP network. Nonetheless, the RC floods the state of the local outgoing data-links to its control plane neighbor OCCs, using OSPF-TE Opaque Link State Advertisements (OLSAs) [RFC-3630, RFC-4203]. The information in the OLSAs is related to the transport plane and is stored in the TE database. OLSA flooding is performed every time a data-link is used by an LSP or is released.

The RC module implements communication interfaces with the CC and with the LRM. The CC requests route computation between two end nodes, whereas the LRM notifies the RC about whether a local resource has been used or released, and about failures. Those state changes trigger OLSA flooding.

Finally, the CC (Fig. 5-14) is responsible for LSP set-up and tear-down. The CC module includes the RSVP module, which implements the RSVP-TE protocol. The CC contains the PSB database, which stores every LSP already established using resources at the local optical node. The NMS communicates with the CC through the NMI-A interface to request the set-up or tear-down of connections. Upon reception of a set-up command between two end nodes, the CC asks the RC for a route at the transport plane. Moreover, every CC in the route of an LSP being set up must ask the LRM about the availability of the local resources and request it to allocate those resources.


[Figure: the Routing Controller, composed of the RC Module, quagga OSPF, MsgFSM, TEDB, LSDB, Algorithms, RC Server, OSPF API and OLSA blocks, with interfaces to the CC and the LRM, and OSPF-TE.]

Fig. 5-13 The Routing Controller

[Figure: the Connection Controller, composed of the CC Module, RSVP Module, MsgFSM, PSB and CC Server, with the NMI-A, CC_RC and CC_LRM interfaces, RSVP-TE, and connections to the RC and the LRM.]

Fig. 5-14 The Connection Controller Architecture

5.3 The Request Generator

In order to compare different solutions in terms of blocking probability, in this thesis we have developed a connection request generator. This tool generates connection requests to evaluate the performance of different protection strategies as well as to compare them.

Traffic is modeled using the approach of Dwivedi and Wagner [DwWa00]. This model differentiates between three traffic types: voice traffic, transactions data traffic (business IP traffic) and Internet traffic (IP traffic not related to business). The resulting total traffic between locations A and Z is derived as the sum of the previous component patterns.

While telephonic (voice) traffic is mainly exchanged between locations that are geographically close, the exchange of internet traffic is much less related to the distance.

According to [Ma03] telephonic traffic intensity is inversely proportional to the distance between origin and destination (DA-Z), transactions traffic intensity is inversely proportional to the square root of the distance, and Internet traffic is independent of the distance.

To calculate the destination of a connection request arriving at a source node, two random numbers are generated. The first one defines the type of traffic and the second one is applied over an ordered array containing the list of all nodes in the topology to obtain the destination node. The array is ordered according to the type of traffic selected in the first step.
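The two-step selection can be sketched in Python as shown next. This is one possible interpretation, not the generator's actual code: instead of indexing an ordered array, it draws the destination with weights that follow the distance dependence stated above (1/D for voice, 1/sqrt(D) for transactions, constant for Internet). The distances, the traffic mix and the function name `pick_destination` are hypothetical.

```python
import random
from math import sqrt

# Distance dependence of each traffic type, as stated in the text.
WEIGHT = {
    "voice": lambda d: 1.0 / d,
    "transaction": lambda d: 1.0 / sqrt(d),
    "internet": lambda d: 1.0,
}

def pick_destination(distances_km, traffic_mix, rng):
    """Draw a traffic type first, then a destination weighted by distance.

    distances_km: {node: distance from the source node}
    traffic_mix:  {traffic type: probability}
    """
    types = list(traffic_mix)
    traffic_type = rng.choices(types, weights=[traffic_mix[t] for t in types])[0]
    nodes = list(distances_km)
    weights = [WEIGHT[traffic_type](distances_km[n]) for n in nodes]
    destination = rng.choices(nodes, weights=weights)[0]
    return traffic_type, destination

rng = random.Random(1)
distances = {"B": 300.0, "C": 600.0, "D": 1200.0}            # assumed distances
mix = {"voice": 0.3, "transaction": 0.3, "internet": 0.4}    # assumed traffic mix
print(pick_destination(distances, mix, rng))
```

With a pure voice mix, closer nodes are chosen more often; with a pure Internet mix, all destinations are equally likely.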

Two types of events have been considered in the request generator here presented:

Set up LSP: Given a demand (source and destination nodes, duration, bandwidth, and class of service), try to establish (determine the route and reserve the resources) a new LSP in the network taking into account the current links occupation (and the failure state of the links if it is the case). If successful, schedule the tear-down event of the LSP.

Tear-down LSP: Release the LSP resources.

Connections are requested at each OCC according to a Poisson process with a predefined mean inter-arrival time (iat). The connection holding time is exponentially distributed with a predefined mean (ht). The destination of each connection request is defined by the mix of traffic patterns described above. The average traffic intensity in Erlangs departing each node is therefore:

E=ht/iat (5.1)
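A minimal Python sketch of the event generation for one source node follows. The values of iat and ht, the function name `generate_events` and the event tuples are assumptions made for the example. Unlike the tool described here, the sketch schedules every tear-down unconditionally, since it does not model route computation or blocking.

```python
import heapq
import random

def generate_events(iat, ht, n_requests, seed=0):
    """Return time-ordered (time, event, request_id) tuples for one source node.

    iat: mean inter-arrival time (Poisson arrivals, i.e. exponential gaps)
    ht:  mean holding time (exponentially distributed)
    """
    rng = random.Random(seed)
    events = []
    now = 0.0
    for request_id in range(n_requests):
        now += rng.expovariate(1.0 / iat)            # next arrival instant
        heapq.heappush(events, (now, "set-up", request_id))
        release = now + rng.expovariate(1.0 / ht)    # scheduled tear-down instant
        heapq.heappush(events, (release, "tear-down", request_id))
    return [heapq.heappop(events) for _ in range(len(events))]

iat, ht = 2.0, 10.0          # assumed values, expressed in the same time unit
print("offered load (Erlangs):", ht / iat)   # (5.1) gives 5.0 for these values
events = generate_events(iat, ht, n_requests=1000)
```

A priority queue keeps set-up and tear-down events interleaved in time order, which is what a discrete-event driver needs to replay them against the control plane.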

This tool supports the generation of traffic associated to four defined classes, described in Table 5-2. Every class is described using a mix of traffic patterns defined above, the traffic intensity in Erlangs, and the holding time. These classes of traffic will be used in the following chapters of this thesis.


Table 5-2 Classes of traffic definition

| Class | Description |
|---|---|
| SP | Shared-protected class. |
| DP | Dedicated-protected class. |
| UP | Unprotected class. |
| BE | Best-effort class. Preempted when resources are needed for protection of the SP class. |

5.4 Summary

The ASON/GMPLS CARISMA network test-bed has been presented. It consists of three independent planes that communicate with each other through a set of interfaces. The architecture and the main characteristics of every module have been presented, to aid understanding of the following chapters.

Specifically, the transport plane uses physical sROADMs and OXC emulators, covering a wide range of network topologies.

The sROADM consists of three stages, namely, the optical demultiplexer, the optical switch or add/drop stage, and the optical multiplexer. The optical demultiplexer separates the signal in an inlet fiber into individual wavelengths. These wavelengths are then either dropped, or connected to a 4x2 optical switch, or connected to an optical multiplexer through the optical splitters. The optical splitters allow for multicast connections. The last stage is the optical multiplexer, which is responsible for aggregating all the wavelengths, either added or coming from the optical demultiplexer, into an output fiber.

The described functionality has been split among several physical cards. Among them, we stress:

The OSNL card is equipped with two optical switches and 2 optical power meters.

The Transponder card integrates client and DWDM transceivers, a PM module, a switch for reconfiguration and an optional booster amplifier.

The Master Card controls the complete optical node and interfaces with elements located in other ASON planes. Besides the standard SNMP protocol, the node implements a more efficient proprietary XML-based protocol.

Each card in the optical node is equipped with a card processor. A serial bus connects the cards in a chassis.


The optical node can be remotely configured through a management interface. In this chapter, we have designed the CCI interface using both XML and SNMP. Moreover, the control plane information can also be transported as part of the optical signal, through the OSC channel.

At the control plane, every OCC consists of three modules: the LRM, the RC and the CC. Finally, a request generator has been developed.


Chapter 6

Shared-Path Protection

In the last chapter, we presented the ASON/GMPLS CARISMA network test-bed, the environment where our protection mechanism proposals are tested. In this chapter, our proposal to implement shared-path protection (SPP) with extra-traffic in ASON rings provided with a GMPLS control plane is presented. We first present the basics of implementing SPP on rings. Then, we present our algorithm to compute disjoint paths in the presence of extra-traffic. The performance of the algorithm is compared with that of a well-known algorithm in terms of blocking probability. The efficiency of SPP can be improved by supporting extra-traffic. In this case, protection resources are used to transport this extra-traffic under normal conditions, and it will be preempted in case of failure. Two alternative implementation approaches are discussed, and their performance is evaluated against the dedicated path protection scheme.

6.1 Shared-path Protection (SPP) in rings

SPP has usually been proposed for implementation over mesh networks, as resource sharing for protection LSPs is only performed among link-disjoint working LSPs. We propose to implement SPP in GMPLS-based DWDM rings. In this case, two different wavelengths are used: one to support working LSPs and the other to support protection LSPs. Fig. 6-1a shows two dedicated-protected (DP) connections, 5-7 and 1-4, whose working LSPs do not overlap. If working and protection LSPs for both connections were routed using the same wavelength, SPP could not be implemented in rings, since the protection LSP of one connection would use resources already allocated to the working LSP of the second connection. However, when the working LSPs are routed using a common wavelength, wavelength i in Fig. 6-1b, protection LSPs can share resources in wavelength k.

Note that the protection LSPs share two common resources: links 1-7 and 4-5. Fig. 6-1b also suggests that two additional SP connections, 1-7 and 4-5, could be allocated, increasing the resource usage ratio. Generalizing this, all the shared-protected working LSPs using wavelength i share wavelength k for their protection LSPs.

Fig. 6-1 Dedicated protection and shared-path protection in optical rings.

Let us denote W as the number of wavelengths available in each link of the ring. To univocally determine which wavelength has to be used for the route of a protection LSP, we propose to split the set of wavelengths into two bands: wavelengths in the set SPWL = {1, … , W/2} support working LSPs, whereas wavelengths in BEWL = {W/2+1, … , W} support protection LSPs. Therefore, if the routing algorithm chooses wavelength i for the working LSP of a connection, then the protection LSP will be routed through wavelength W-i+1.

In the absence of failures, protection resources can be used to transport best-effort (BE) extra-traffic, which will be preempted in case of failure. As resources in wavelength W-i+1 have been assigned to protect resources in wavelength i, once the working LSP of an SP connection is routed using wavelength i, all resources in wavelength W-i+1 (not only those in the protection LSP of that path) are available to be used for BE traffic.
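The pairing between working and protection wavelengths can be written as a one-line mapping. The Python sketch below illustrates it for an assumed W = 8; the function name `protection_wavelength` and the requirement that W be even are assumptions of the example.

```python
def protection_wavelength(i, W):
    """Index of the wavelength that protects working wavelength i (1-based)."""
    if W % 2 != 0:
        raise ValueError("this sketch assumes an even number of wavelengths")
    if not 1 <= i <= W // 2:
        raise ValueError("working wavelengths are restricted to 1..W/2 (SPWL)")
    return W - i + 1

W = 8   # assumed number of wavelengths per link
pairs = {i: protection_wavelength(i, W) for i in range(1, W // 2 + 1)}
print(pairs)   # {1: 8, 2: 7, 3: 6, 4: 5}
```

Every working wavelength in SPWL maps to a distinct wavelength in BEWL, so the protection wavelength never has to be signaled separately from the working one.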

[Figure: panels a) and b), each showing a ring with nodes 1 to 7 at wavelengths i and k, with working routes w1 and w2 and protection routes p1 and p2.]


6.2 Routing and Wavelength Assignment (RWA)

Since the network is deployed without wavelength converters, its topology can be described using a different graph for each wavelength. Let us denote Gi(V, Ei) as the graph describing the resources at wavelength i.

Recall that every OCC floods the state of the local outgoing data-links to its control plane neighbor OCCs, using OLSAs, every time a data-link is used by an LSP or is released. Therefore, the set of graphs can be dynamically updated to represent the current allocation state of the data-links in the network. Moreover, we use an additional graph G(V, E) which represents the physical network topology independently of the allocation state of the resources.

We have developed the Pre-Computed RWA (PC-RWA) algorithm to compute unprotected routes and link-disjoint path pairs under the wavelength continuity constraint. However, before presenting the PC-RWA algorithm, a general algorithm to compute the min-cost (or shortest) pair of link-disjoint paths between nodes over any topology is reviewed. The algorithm is called Shortest Disjoint Path Pair and uses the Modified Dijkstra algorithm [Bh99]. This algorithm allows shortest paths to be found in graphs with one or more negative link weights but no cycles of negative weight. Negative link weights arise in the intermediate steps of the shortest disjoint path pair problem.

To illustrate the shortest disjoint path pair algorithm we make use of the well-known “trap topology” [DuGr94] shown in Fig. 6-2a. In a trap topology, the working route may block all the possible link-disjoint backup routes although the network topology is biconnected. For example, when trying to find the shortest path pair, we could search for the shortest path (p1) from node 1 to node 4, resulting in the path drawn in Fig. 6-2b. In this case, we will not be able to find any link-disjoint path. According to [LiKa02], trap topologies are found in typical carrier network backbones.

Fig. 6-2 Trap Topology: (a) network topology; (b) shortest path p1 from node 1 to node 4.

The steps of the algorithm are:

1. Create the shortest path from 1 to 4: p1 (Fig. 6-2b).

2. Create negative reverse-directed links from p1 (Fig. 6-3a).

3. Find the shortest path in the graph of Fig. 6-3a: p2 (Fig. 6-3b).

4. Remove interlacing link(s); create path segments p1(a), p1(b) and p2(a), p2(b) (Fig. 6-3c).

5. Alternate between path segments to construct the disjoint path pair (Fig. 6-3d).

Fig. 6-3 Shortest Disjoint Path Pair algorithm: (a) graph with negative reverse-directed links; (b) shortest path p2; (c) path segments p1(a), p1(b), p2(a) and p2(b); (d) resulting disjoint path pair.
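The five steps of the Shortest Disjoint Path Pair algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the test-bed: Bellman-Ford is used here in place of the Modified Dijkstra algorithm of [Bh99] (both tolerate negative link weights when there are no negative cycles), and the link list `TRAP` with its costs is a hypothetical trap topology, since the exact links of Fig. 6-2 are not reproduced in the text.

```python
def bellman_ford(arcs, nodes, src, dst):
    """Shortest path over directed arcs {(u, v): cost}; negative costs allowed."""
    dist = {v: float("inf") for v in nodes}
    pred = {}
    dist[src] = 0
    for _ in range(len(nodes) - 1):
        changed = False
        for (u, v), cost in arcs.items():
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
                changed = True
        if not changed:
            break
    if dist[dst] == float("inf"):
        return None
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    return path[::-1]


def disjoint_path_pair(links, src, dst):
    """Return two link-disjoint paths of minimum total cost, or None."""
    nodes = {n for u, v, _ in links for n in (u, v)}
    arcs = {}
    for u, v, cost in links:            # undirected link -> two directed arcs
        arcs[(u, v)] = cost
        arcs[(v, u)] = cost
    # Step 1: shortest path p1.
    p1 = bellman_ford(arcs, nodes, src, dst)
    if p1 is None:
        return None
    p1_arcs = list(zip(p1, p1[1:]))
    # Step 2: drop the forward arcs of p1 and negate the reverse ones.
    for u, v in p1_arcs:
        cost = arcs.pop((u, v))
        arcs[(v, u)] = -cost
    # Step 3: shortest path p2 in the modified graph.
    p2 = bellman_ford(arcs, nodes, src, dst)
    if p2 is None:
        return None
    p2_arcs = list(zip(p2, p2[1:]))
    # Step 4: remove interlacing links (used by p1 and, reversed, by p2).
    keep = set(p1_arcs) | set(p2_arcs)
    for u, v in p1_arcs:
        if (v, u) in p2_arcs:
            keep.discard((u, v))
            keep.discard((v, u))
    # Step 5: follow the remaining segments from src to build both paths.
    succ = {}
    for u, v in keep:
        succ.setdefault(u, []).append(v)
    paths = []
    for _ in range(2):
        path = [src]
        while path[-1] != dst:
            path.append(succ[path[-1]].pop())
        paths.append(path)
    return sorted(paths)


# Hypothetical trap topology (links and costs are assumptions of this sketch).
TRAP = [(1, 2, 1), (2, 3, 1), (3, 4, 1),
        (1, 5, 2), (5, 3, 2), (2, 6, 2), (6, 4, 2)]
```

In this hypothetical topology the shortest path 1-2-3-4 leaves no link-disjoint alternative, yet the sketch still returns the pair 1-2-6-4 and 1-5-3-4, because the link 2-3 is traversed in reverse by p2 and removed as an interlacing link.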

In order to implement segment protection, the graph representing the network has to be broken down into its biconnected components by finding the nodes whose single failure would partition the graph into separate sub-graphs. These nodes are called articulation points (Fig. 6-4). To find all the biconnected components of a graph we use a procedure based on a depth-first traversal of the graph, followed by a backtracking phase. The procedure can be found in [Gr04].

Fig. 6-4 Breaking down the network graph into its biconnected components; the articulation points are the nodes shared between components.
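A depth-first procedure of this kind can be sketched as below. It is the classical low-link formulation, which may differ in detail from the procedure in [Gr04], and the example graph `EXAMPLE` is a hypothetical one chosen for illustration (three triangles joined through nodes 3 and 7).

```python
def articulation_points(adj):
    """Return the articulation points of an undirected graph.

    adj maps each node to the set of its neighbors. The recursion is fine
    for small graphs such as the ones considered here.
    """
    disc, low, points = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge: u can reach an earlier node without its parent.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # v's subtree cannot climb above u: removing u isolates it.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
        # The DFS root is an articulation point only with two or more subtrees.
        if parent is None and children > 1:
            points.add(u)

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return points


# Hypothetical graph: triangles {1,2,3}, {3,4,5} and {7,8,9} plus the link 3-7.
EXAMPLE = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5, 7},
    4: {3, 5}, 5: {3, 4},
    7: {3, 8, 9}, 8: {7, 9}, 9: {7, 8},
}
```

For this graph the function returns nodes 3 and 7, which are exactly the nodes shared by more than one biconnected component.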

To find the shortest path pair between two nodes, a network view locating the biconnected components of the network is created. For each biconnected component, a new shortest path tree based on the min-hop criterion is built by executing the Modified Dijkstra algorithm. To compute the route of a disjoint path, the Shortest Disjoint Path Pair algorithm is used to calculate the disjoint path pair in each biconnected component. By connecting the segments, the complete segment-disjoint path pair is obtained.

Nevertheless, for its use in SPP, an algorithm must provide resource sharing in the protection routes. To do so, our PC-RWA algorithm computes the disjoint route over the graph G and it translates that route to the corresponding wavelength.

Table 6-1 PC-RWA Algorithm

Procedure PC-RWA (IN Node source, destination)
begin
    Route w, p
    ShortestRouteSP (source, destination, w)
    If length(w) == 0 then
        No route found;
    Look for single route p link-disjoint with w in G
    Move p to wavelength W-wavelength(w)+1
    Use route w for the working LSP and route p for the protection LSP
end

Procedure ShortestRouteSP (IN Node source, destination; OUT Route r)
begin
    distance = get distance from source to destination in G
    minDistance = INFINITE
    minWL = 0
    For each wavelength i in SPWL do
        If Gi is not updated then
            Update source’s shortest path tree in Gi
        distanceWL = get distance from source to destination in Gi
        If (distanceWL < minDistance) then
            minDistance = distanceWL
            minWL = i
        end If
        If (minDistance == distance) then
            Break loop
    end For
    If minWL > 0 then
        Create the route r from source to destination in GminWL
end


The algorithm first calculates the distance between origin and destination nodes over the graph G. This is the minimum distance. Then, it searches for a wavelength i providing this minimum. If there is no wavelength providing the overall minimum distance, the wavelength with minimum distance among all wavelengths is chosen. The working route is then computed over the graph Gi. Table 6-1 shows the pseudo-code for the PC-RWA algorithm.

In the PC-RWA algorithm all wavelengths are numbered, in a similar way as the well known First-Fit (FF) heuristic algorithm ([ZaJu00]). However, solutions obtained with both algorithms may be different. As an example, Fig. 6-5 shows three graphs representing a network: graph G represents the physical topology of the network, whereas G1 and G2 represent the current status of the resources at wavelengths 1 and 2 respectively. The minimum distance between nodes 6 and 3 is 3 hops, as can be observed in the graph G. The distance between those nodes is 4 in G1 and 3 in G2. Note that our PC-RWA algorithm will compute the route over G2 (3 hops), while the FF algorithm would compute the route over G1 (4 hops).

Fig. 6-5 Example of a network represented by three graphs (G, G1 and G2).
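The difference between the two wavelength selection rules can be reproduced with a small sketch. Several things here are assumptions made only for illustration: the seven-node ring and the busy links are hypothetical (they are chosen so that the distances between nodes 6 and 3 are 3, 4 and 3 hops in G, G1 and G2, as in the example above), the helper names are invented, and hop distances are computed on demand with a breadth-first search instead of being read from pre-computed shortest path trees.

```python
from collections import deque


def hop_distances(adj, src):
    """Breadth-first hop counts from src; unreachable nodes are absent."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ring(n, busy=()):
    """Adjacency of an n-node ring (nodes 1..n) without the busy links."""
    adj = {v: set() for v in range(1, n + 1)}
    for u in range(1, n + 1):
        v = u % n + 1
        if (u, v) not in busy and (v, u) not in busy:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def pc_rwa_wavelength(G, graphs, src, dst):
    """Pick the wavelength with the fewest hops; stop at the physical minimum.

    graphs maps each wavelength to its current adjacency. Returns a
    (wavelength, hops) tuple, or None when no wavelength offers a route.
    """
    target = hop_distances(G, src).get(dst)
    best = None
    for wl in sorted(graphs):
        d = hop_distances(graphs[wl], src).get(dst)
        if d is not None and (best is None or d < best[1]):
            best = (wl, d)
        if best is not None and best[1] == target:
            break                      # cannot do better than the distance in G
    return best


def first_fit_wavelength(graphs, src, dst):
    """First-Fit: take the lowest-numbered wavelength offering any route."""
    for wl in sorted(graphs):
        d = hop_distances(graphs[wl], src).get(dst)
        if d is not None:
            return (wl, d)
    return None


G = ring(7)
GRAPHS = {1: ring(7, busy={(4, 5)}),   # wavelength 1: link 4-5 in use
          2: ring(7, busy={(1, 2)})}   # wavelength 2: link 1-2 in use
```

For a request between nodes 6 and 3, `pc_rwa_wavelength(G, GRAPHS, 6, 3)` selects wavelength 2 with 3 hops, whereas `first_fit_wavelength(GRAPHS, 6, 3)` selects wavelength 1 with 4 hops.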

The performance of both the FF and the PC-RWA algorithms has been experimentally evaluated over the ASON/GMPLS CARISMA network test-bed described in Chapter 5.


Fig. 6-6 compares the performance of the First-Fit (FF) heuristic against the PC-RWA in terms of blocking probability as a function of the offered traffic for a five-node ring with 20 wavelengths per link. The offered traffic ranges from 1 to 10 Erlangs/node. The blocking probability for both algorithms is negligible when the offered traffic is low. However, when the offered traffic increases, the blocking probability for the FF heuristic becomes slightly higher than for the PC-RWA algorithm.

Fig. 6-6 Performance of PC-RWA vs. FF: blocking probability (0% to 10%) against offered SP and BE traffic (1 to 10 Erlangs/node).

The response time of the PC-RWA algorithm is accelerated by having the shortest path tree pre-computed for every wavelength graph. On reception of OLSAs updating or deleting data-links, each OCC recalculates the shortest path tree in the corresponding wavelength graph. The signaling of an LSP (set-up or tear-down) generates a group of OLSAs, so a short convergence time is needed. Therefore, the shortest path computation is not performed immediately after the reception of a single OLSA; it is delayed for a short time so that the whole group of OLSAs generated by one LSP can arrive. If a request arrives and the shortest path tree is not updated in some graph Gi, the shortest path tree is computed at that time. Note that if the shortest path trees are updated at the time of computing a disjoint pair of routes, the computational complexity of the PC-RWA algorithm is given by the computational complexity of the Modified Dijkstra algorithm, O(|V|log|V|) [Bh99], which is used to compute the protection route. Therefore, the PC-RWA algorithm provides constant computation times independently of the wavelength assigned.
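The delayed recomputation described above can be sketched as a small cache with a hold-down timer. The class name `SptCache`, its methods and the 50 ms hold-down value are assumptions of this sketch; the text only states that the computation is delayed "a short time". Time is passed in explicitly so that the behavior is easy to follow.

```python
class SptCache:
    """Per-wavelength shortest path trees, recomputed lazily after OLSAs."""

    def __init__(self, compute_tree, hold_down=0.05):
        self.compute_tree = compute_tree   # callable: wavelength -> tree
        self.hold_down = hold_down         # seconds (hypothetical value)
        self.trees = {}
        self.dirty_since = {}              # wavelength -> time of first OLSA

    def on_olsa(self, wavelength, now):
        """Mark the graph as outdated; keep the time of the first pending OLSA."""
        self.dirty_since.setdefault(wavelength, now)

    def tick(self, now):
        """Recompute the trees whose hold-down interval has expired."""
        for wavelength, since in list(self.dirty_since.items()):
            if now - since >= self.hold_down:
                self._refresh(wavelength)

    def get_tree(self, wavelength):
        """Return an up-to-date tree, computing it on demand if necessary."""
        if wavelength in self.dirty_since or wavelength not in self.trees:
            self._refresh(wavelength)
        return self.trees[wavelength]

    def _refresh(self, wavelength):
        self.trees[wavelength] = self.compute_tree(wavelength)
        self.dirty_since.pop(wavelength, None)
```

With this structure, a burst of OLSAs for the same wavelength triggers a single recomputation, and a route request only pays for a tree computation when it arrives before the hold-down expires.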


6.3 SPP with Extra-traffic in ASON/GMPLS Rings

Our proposal to implement SPP in ring-based networks was introduced at the beginning of this chapter. Recall that it consists of splitting the set of wavelengths available in each link into two bands: wavelengths in the set SPWL={1, … , W/2} support working LSPs, whereas wavelengths in BEWL={W/2+1, … , W} support protection LSPs.

To implement SPP we follow the GMPLS recovery framework [RFC-4872, RFC-4873]. In this framework three protection schemes are specified: 1+1 dedicated protection, 1:N (N >= 1) LSP protection with extra-traffic, and pre-planned rerouting without extra-traffic. In the 1:N scheme, N working LSPs (having the same origin and destination) are protected by one LSP. In the pre-planned LSP rerouting, two disjoint LSPs are established between the end nodes: the working and the protection LSP. The working LSP is implemented in the transport plane, while the resources of the protection LSP are only pre-reserved in the control plane and therefore, an explicit signaling is required to instantiate them in the transport plane. This gives the opportunity of reusing the protection reserved resources to accommodate extra-traffic without the constraint of sharing the same origin and destination nodes.

In our implementation the SPP scheme uses pre-planned LSP rerouting. When the working LSP of a shared-protected connection is affected by a link failure, the protection LSP is signaled and activated in the transport plane. Since protection resources are shared and protection LSPs are only created in a failure-driven way, it is possible to reuse the protection capacity to transport extra-traffic. In the event of a link failure, the extra-traffic can be preempted to accommodate the working traffic to be protected.

Several approaches to transport extra-traffic can be implemented using pre-planned LSP rerouting. The first approach consists of transporting extra-traffic only over the resources reserved for protection LSPs of already established SP connections. For example, when connection 1 is established in the ring in Fig. 6-1b, only resources used by the protection LSP (p1) could be used for extra-traffic. When the SP connection is torn down, the extra-traffic must also be torn down. Moreover, as several SP connections share protection resources, specific per-resource reserve counters must be maintained to know whether a resource can be used for extra-traffic. Therefore, the overhead introduced by this approach appears to be too high.

The second approach consists of reserving all resources in the whole ring in wavelength W-i+1 to accommodate extra-traffic whenever i transports at least one working LSP. For example, when connection 1 is established in the ring in Fig. 6-1b, the working LSP (w1) is established in the transport plane using resources in i. Resources of the protection LSP (p1) in k (k=W-i+1) are reserved in the control plane to be used in case of failure of w1, and thus all resources in k can be used for extra-traffic. When no resources in i are used for SP working LSPs any longer, all extra-traffic in k must be torn down. Note that in this approach only per-wavelength reserve counters need to be maintained. The overhead introduced by this approach is much lower, since only W/2 reserve counters are needed.

Finally, considering that resources in BEWL are all reserved for protection LSPs, all these resources could be used to transport extra-traffic. In that case, extra-traffic would be completely de-coupled from SP traffic: the extra-traffic would use all those resources, and reserve counters would not be required. Therefore, this approach does not introduce any overhead.

We have implemented both the second (per-wavelength) and third (full-band) approaches. A network without wavelength conversion which is able to deal with two classes of service (the shared-protected service (SP) and the best effort (BE) preemptable service) is deployed and the blocking probability obtained with both approaches (and with DPP) is compared.

We use the PC-RWA algorithm previously introduced to compute disjoint path pairs in presence of extra-traffic. Note that the PC-RWA algorithm computes the disjoint route over the generic graph G and then, it moves the route to the corresponding protection wavelength. Therefore, it can be used when the resources in the set of wavelengths assigned to protection are used for extra-traffic. For the working route, the algorithm searches routes within SPWL.

The shortest route for a best-effort connection is computed in a similar way to the working routes for shared-protected connections. In this case, the search is performed within BEWL.

6.3.1 SPP with extra-traffic Implementation

In the per-wavelength approach, data-links belonging to wavelengths in SPWL are advertised as free and with available bandwidth for SP traffic. Data-links in BEWL, however, are advertised as free but with no available bandwidth for any traffic. This allows the routing algorithm in the OCCs to know the network topology at every wavelength, while preventing those resources from being used to route best-effort traffic.

When a new SP connection is signaled, the origin OCC sends two RSVP-TE Path messages: one for the working LSP and another for the protection LSP. The working LSP is signaled with label i, whereas the protection LSP is signaled with W-i+1. When the Path messages arrive at the destination node, Resv messages are originated and the resources are allocated for the working LSP and reserved for the protection LSP. At this point, all OCCs in the ring know that local resources in W-i+1 can be used for extra-traffic, and these resources are advertised with available bandwidth for the BE class through OLSAs to all nodes in the ring.

To properly manage whether resources in W-i+1 can be used for extra-traffic, we use a per-wavelength reserve counter. Every SP connection establishment increments that counter for local resources in W-i+1. Conversely, the counter is decremented in the tear-down process. When an SP connection is torn down, the origin OCC sends RSVP-TE Path-Tear messages for the working and the protection LSPs. If this shared-protected connection is the last one using i, best-effort traffic using W-i+1 must be de-allocated. Upon the reception of a Path-Tear message for a working LSP in i or a protection LSP in W-i+1, each OCC decrements the reserve counter for resources in W-i+1. When this counter is 0 and the resource is free, the resource is advertised with no available bandwidth, which prevents it from being further used for BE traffic. If the resource is allocated, a notification is sent to the origin OCC containing the LSP’s session Id. Upon the reception of this message, the origin OCC should send a tear-down request for that BE-class connection and notify the client through the UNI interface [RFC-4208]. At the end of this process all resources in W-i+1 will be released and advertised with no available bandwidth for best-effort traffic.
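The bookkeeping behind the per-wavelength reserve counter can be sketched as follows. This is a simplified model under stated assumptions: it represents the counters kept by a single OCC, increments each counter once per SP connection, and leaves out the RSVP-TE and OSPF-TE messages themselves; the class and method names are hypothetical.

```python
class ReserveCounters:
    """Per-wavelength reserve counters for the protection band of one OCC."""

    def __init__(self, W):
        self.W = W
        # One counter per protection wavelength k in BEWL = {W/2+1, ..., W}.
        self.count = {k: 0 for k in range(W // 2 + 1, W + 1)}

    def sp_setup(self, i):
        """An SP connection with working wavelength i is established.

        Returns True when k = W-i+1 becomes usable for BE traffic, that is,
        when it has to be advertised with available bandwidth.
        """
        k = self.W - i + 1
        self.count[k] += 1
        return self.count[k] == 1

    def sp_teardown(self, i):
        """An SP connection with working wavelength i is torn down.

        Returns True when it was the last one, that is, when BE traffic in
        k = W-i+1 must be torn down and k advertised with no bandwidth.
        """
        k = self.W - i + 1
        if self.count[k] == 0:
            raise ValueError("no SP connection is using this wavelength")
        self.count[k] -= 1
        return self.count[k] == 0

    def be_allowed(self, k):
        """BE traffic may use wavelength k only while its counter is positive."""
        return self.count[k] > 0
```

Only W/2 integers are kept per node, which is the reason why the overhead of this approach is much lower than that of per-resource counters.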

Note that the per-wavelength approach introduces an overhead in the GMPLS control plane. On the one hand, an RSVP-TE notification is needed when protection resources are used for best-effort traffic and the per-wavelength reserve counter reaches 0. In that case, tear-down RSVP-TE signaling for those LSPs is also needed. However, considering that these forced tear-downs are very infrequent with respect to the number of normal set-ups and tear-downs, the introduced overhead is negligible. On the other hand, an additional overhead is introduced in the routing process, since OSPF-TE OLSAs need to be originated to advertise the data-links in wavelength k with available bandwidth for the BE class. Note in this regard that the PC-RWA algorithm must recalculate the shortest path tree upon the arrival of a group of OLSAs. However, since the shortest path tree is computed upon the reception of the OLSAs, and not when a new route needs to be computed, its performance is not affected. No additional OLSAs are generated when a reserve counter is decreased to 0. Therefore, we can conclude that the overhead introduced by the per-wavelength approach is limited to the OSPF-TE advertisement produced when the first SP connection using resources in wavelength i is set up. The number of OLSAs generated in that case is 2*n, where n is the number of nodes in the ring.

In the full-band approach, all data-links are advertised as free and with available bandwidth. The PC-RWA algorithm looks for available resources in the set of wavelengths corresponding to the class of service requested. Since BE traffic can use all resources in BEWL, no reserve counters and no additional flooding for resources in BEWL are needed. As a consequence, the above discussion about the introduced overhead does not apply to this approach.


6.3.2 Performance Evaluation

For comparison reasons, we have also implemented the DPP scheme with extra-traffic, using the 1:N (N = 1) LSP protection with extra-traffic, specified in [RFC-4872, RFC-4873]. In this scheme, the working and the protection LSPs are signaled and effectively activated in the transport plane. When a link failure affects the working LSP of a DP connection, the nodes adjacent to the failure send RSVP-TE Notify messages to the end nodes. Upon the reception of a Notify message the end nodes switch the traffic on the working LSP to the protection LSP and the extra-traffic is preempted.

Several differences between SPP and DPP can thus be found: 1) Since only the end nodes have to switch traffic in the event of a failure, DPP provides faster protection times than SPP; 2) Working and protection LSPs are routed using the same wavelength in the dedicated scheme, whereas different wavelengths are used in the shared scheme; 3) In the DPP scheme the end points of the extra-traffic are predetermined by the protection LSP, whereas in the SPP scheme extra-traffic and protected traffic are de-coupled; 4) Due to its shared nature, SPP provides better resource usage than DPP.

The performance of DPP and SPP with extra-traffic (both the per-wavelength and the full-band approaches) has been experimentally evaluated over the ASON/GMPLS CARISMA network test-bed described in Chapter 5.

Fig. 6-7 Blocking probability against SP traffic load (blocking probability from 0% to 25% for offered SP and BE traffic from 1 to 10 Erlangs/node; curves: DP, SP, BE (DPP), BE (SPP per-wavelength), BE (SPP full-band)).

Fig. 6-7 shows the blocking probability as a function of the offered traffic for a five-node ring with 40 wavelengths per link (C band). The offered traffic ranges from 1 to 10 Erlangs/node of SP (or DP) and BE traffic. The results are plotted with their 95% confidence intervals, so that the accuracy of the results can be appreciated.

The blocking probability for DP and SP traffic is negligible when the offered traffic is low. However, when the offered traffic increases, the blocking probability for DP traffic becomes higher than for SP traffic. Note that in the DP scheme, one protected connection uses all resources in a wavelength. In the case of SPP, although only half of the wavelengths are used for the working LSPs, it is possible that more than two working LSPs are being simultaneously transported in one wavelength. This clearly shows the advantage of the SPP scheme.

Moreover, the SPP scheme (even the per-wavelength approach) provides higher flexibility in the use of spare resources to transport best-effort traffic. Served BE traffic in the per-wavelength approach of the SPP scheme is conditioned by the availability of resources in the wavelengths used for protection; served BE traffic in the DPP scheme is conditioned by the availability of protected connections with the same origin/destination nodes. This flexibility provided by the SPP scheme marks the difference in terms of blocking probability. When the offered traffic is low, the BE traffic presents a high blocking probability in both schemes. However, the DPP scheme provides the worst blocking probability, since it is highly improbable to find protected connections with the same origin/destination nodes as the requested BE connection that are not already being used for extra-traffic. When the offered traffic is high, the BE blocking probability decreases and approaches that of the corresponding protected (SP or DP) traffic, since more resources are available for extra-traffic; from that point on it follows the upward trend of the protected traffic under high load.

The blocking probability obtained for BE traffic under the per-wavelength SPP approach, although much lower than with DPP, is still high. However, when considering the full-band SPP approach, the blocking probability for the BE traffic coincides with the SP blocking probability, since the same amount of resources is available to each traffic class.

Therefore, the full-band approach shows clear advantages with respect to the per-wavelength approach.

6.4 Summary

In this chapter we have discussed the shared-path protection with extra-traffic for ASON/GMPLS optical rings. At the beginning of this chapter our proposal to implement SPP in rings has been detailed.

With regard to routing, the PC-RWA algorithm has been presented. Its performance has been compared against the well-known First-Fit algorithm, showing better performance in terms of blocking probability. Besides unprotected routes, the PC-RWA algorithm computes disjoint path pairs and provides constant computation times independently of the wavelength assigned.

Two different approaches have been implemented: the per-wavelength and the full-band. In the former, extra-traffic is transported by resources in wavelengths being used by any SP connection for the protection LSP. In the latter, all resources in half of the wavelengths on all links are reserved for protection, and thus can be used for extra-traffic.

The SPP scheme with extra-traffic outperforms the DPP scheme in terms of blocking probability for both the SP and the BE traffic. Moreover, the full-band approach provides better blocking probability to the BE traffic, and shows clear advantages with respect to the per-wavelength approach.


Chapter 7

ROADM Design and Protection Time Model for SPP

In this chapter the protection time provided by the SPP scheme, implemented in the previous chapter, is analyzed as a function of the switching time of the WSS, the key component for building reconfigurable optical nodes. We demonstrate that the switching time of the currently available WSSs prevents the complete set of affected LSPs from being protected within 50 ms after fault detection. Therefore, two classes of protection with different requirements in terms of protection time can be defined.

7.1 ROADM Design

In order to support SPP with extra-traffic, we have designed the OADM shown in Fig. 7-1. The basic components are splitters/couplers (S) and WSSs. The incoming optical signal in the East and West ports can either pass-through or be dropped to any port. The local traffic can be added either to the East or to the West outgoing signals. Note that additional hardware is required to monitor the incoming optical power in order to detect failure conditions.

In the case of a link failure, the adjacent OADMs detect the LoL and notify the failure to the OCCs in the GMPLS control plane. Then, for each LSP to be protected, the OCCs notify the failure to the OCC of the end node (origin or destination) closest to the failure by sending an RSVP-TE Notify message. The address of the node to be notified was received in the NOTIFY_REQ object in the RSVP-TE Path/Resv message. When the source OCC receives the Notify message, the signaling of the protection LSP starts. It consists of sending a Path message to eliminate the extra-traffic from the resources required by the protection LSP, and sending Path/Resv messages to effectively activate the protection LSP.

Fig. 7-1 OADM design to support SPP with extra-traffic (WSSs, splitters/couplers S-1 to S-6, East and West ports, and local access).

One command to create/eliminate a connection in the OADM (Fig. 7-1) implies a set of commands to the WSSs, which will be sequentially processed in case of simultaneous requests.

7.2 Fault localization

Before a protection time model can be developed, it is necessary to define the mechanism used to localize a failure. This section compares two alternative solutions for fault localization.

In the event of a failure, the optical transparency in the data plane leads to downstream propagation of LoL alarms, as all nodes downstream from the failure point detect LoL on their incoming ports. To suppress multiple alarms stemming from the same failure, localization procedures should be implemented in transparent optical networks. One of the main benefits of the out-of-fiber control plane configuration is that control channels remain alive despite data plane failures. Taking advantage of this property, LMP implements a simple fault localization procedure. Note, however, that the applicability of this procedure is restricted to the out-of-fiber configuration. In contrast, fault localization in in-fiber out-of-band configurations usually relies on hardware-based solutions.

Aiming to validate the applicability of LMP as a fault localization protocol, we experimentally compare failure localization times obtained with standard out-of-fiber LMP localization procedures against a hardware-based solution that fits either in-fiber out-of-band or out-of-fiber control plane configurations. Specifically, the hardware-based solution is based on sending and detecting a pilot tone on the endpoints of each link, thus having whole-link granularity. In contrast, LMP provides wavelength granularity, which serves path protection/restoration purposes.

The optical pilot-tone based procedure uses an additional optical transponder to send and receive the OSC channel (e.g. at 1300 nm), which is multiplexed with the bundle of DWDM-multiplexed signals (Fig. 7-2a). An optical power meter monitors the incoming pilot tone power level. As long as a link is not affected by a failure, optical power must always be received at each end of the link. In the event of receiving out-of-bound power levels, the node sends a LoL notification through the CCI, meaning that a link failure has occurred.

Fig. 7-2 a) Additional hardware needed for the optical pilot tone (OSC and link-granularity power meters) and b) for LMP-based fault localization (lambda-granularity power meters).

Let us denote tlocaliza as the time from the out-of-bound power level detection in the OADM until the failure is localized, and tCCI as the communication time between the OADM and the OCC. Using this procedure, we can express this time as tlocaliza = tCCI. Then, in the optical pilot tone based procedure, the time to localize a failure in the control plane remains constant, independent of the network topology. Recall that optical power meters in this solution have monitoring sweep times of 10μs, thus assuring very fast detection.

The LMP-based localization procedure also uses optical power meters on each input port to monitor the incoming optical power level (Fig. 7-2b). In this case, in order to obtain finer granularity, we use an arrayed power meter which is able to monitor every single wavelength in parallel. Upon the reception of an out-of-bound power level in either one or multiple wavelengths, the node sends a LoL notification for these wavelengths through the CCI interface. Upon receiving the LoL notification, the OCC should determine whether the failure is in the local link or in any upstream link (Fig. 7-3). To this end, it sends a ChannelStatus message [RFC-4204] to its upstream neighbor with the list of individual failed data links (or the complete link if all data links have failed). Upon reception of a ChannelStatus message, the neighboring LRM process checks the status of the data links associated to the failed ones through a local LSP connection. If the reception of the associated data links is OK, the failure has been localized in this link; otherwise, another upstream link may be responsible for the failure. Finally, a ChannelStatus message is sent to the downstream neighbor with the status of the data links.

Fig. 7-3 LMP Failure localization (LoL notifications and ChannelStatus messages exchanged among the OCCs).

Let us denote tlink as the propagation delay in each link, and tLMP as the time to process a single LMP message. Using this procedure, the localization time can be expressed as

\[ t_{localiza} = t_{CCI} + 2\,(t_{link} + t_{LMP}) \tag{7.1} \]

Measuring times in distributed environments is not a simple task, since some protocol (e.g. Network Time Protocol, NTP) must be used to provide global clock synchronization. In the CARISMA network test-bed we have developed a special software module called Accounting Manager (AM) to measure times. The task of the AM module is simple: it receives UDP messages from different applications running in a distributed environment and timestamps them using the clock of the local machine. A simple protocol indicates the origin of the message and the task done. The AM module correlates the received messages and gives us the differential times for the process under test.

Using the AM module we can experimentally measure the processing time at every module. To exemplify the measuring process, let us detail the actions involved during the failure localization process under the optical pilot-tone based procedure, assuming emulators as optical nodes. Starting with a protected SNC already configured in the OXC, an external application sends a failure message to inform the OXC about a failure in the input working CP belonging to the protected SNC created. The OXC detects the failure and sends an SNMP trap message to the OCC in the control plane. The OCC (the LRM module) takes the decision to perform a protection switch and sends a command to the OXC to switch the SNC selector to the protection CP (recall that a protected SNC contains a selector to choose between working and protection signals). The OXC executes the received command. In every step of the process the involved applications send messages to the AM module indicating the task done, and we obtain tCCI = 1ms. Using similar procedures, we have measured the processing time of every component. The times used in the following are average times resulting from a large number of tests.

In our implementation we have measured tLMP = 0.2ms. However, tlink depends on the length of the links (L). For example, if we consider metropolitan networks with L=30km, tlink = 0.15ms, whereas in larger networks with L=100km, tlink = 0.5ms. As a result, tlocaliza ranges from 1.7ms to 2.4ms in such scenarios. Besides, the arrayed optical power meters used with this solution have a higher monitoring sweep time of 1ms. Table 7-1 presents a summary of the localization times for the two alternatives.

Table 7-1 Localization times

Optical pilot tone | LMP-based
tlocaliza = tCCI | tlocaliza = tCCI + 2*(tlink + tLMP)
tlocaliza = 1 ms | tlocaliza ranges from 1.7 ms to 2.4 ms for metropolitan and regional networks
Optical power meters with a monitoring sweep time of 10 µs | Arrayed optical power meters with a monitoring sweep time of 1 ms
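The figures in Table 7-1 can be reproduced with a few lines of arithmetic. The sketch below uses the measured values quoted in the text (tCCI = 1 ms, tLMP = 0.2 ms); the propagation delay of 5 µs per km of fiber is an assumption of this sketch, chosen because it is consistent with tlink = 0.15 ms for L = 30 km and tlink = 0.5 ms for L = 100 km. The function names are hypothetical.

```python
FIBER_DELAY_MS_PER_KM = 0.005   # assumption: about 5 us of delay per km of fiber


def t_link_ms(length_km):
    """Propagation delay of one link, in milliseconds."""
    return length_km * FIBER_DELAY_MS_PER_KM


def pilot_tone_localization_ms(t_cci=1.0):
    """Optical pilot tone: tlocaliza = tCCI."""
    return t_cci


def lmp_localization_ms(length_km, t_cci=1.0, t_lmp=0.2):
    """LMP-based: tlocaliza = tCCI + 2*(tlink + tLMP)."""
    return t_cci + 2 * (t_link_ms(length_km) + t_lmp)


if __name__ == "__main__":
    for length in (30, 100):
        print(length, "km:", round(lmp_localization_ms(length), 2), "ms")
```

For L = 30 km this gives 1 + 2*(0.15 + 0.2) = 1.7 ms, and for L = 100 km it gives 1 + 2*(0.5 + 0.2) = 2.4 ms, matching the range reported for metropolitan and regional networks.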

As seen, LMP detection and localization times are generally higher than the ones achieved by the optical pilot-tone based procedure. However, even in large network topologies, their contribution to the total protection time (e.g., 50ms) can be considered marginal. This strongly supports the applicability of LMP. Note that the pilot-tone based procedure is essentially limited to link recovery (due to its coarse granularity). The finer granularity of LMP, in contrast, makes it applicable not only to link recovery but also to path recovery, thus giving support for a broader protection/restoration scope.

7.3 Protection Time Model

Using the previous OADM design as a node, in this section we present a mathematical expression to model the protection times provided by the SPP scheme. Although either of the two options described in the previous section for fault localization could be used, for the sake of simplicity the optical pilot tone option has been chosen to model protection times.


Let us define the protection time (tSPP) in an n-node ring with pre-planned rerouting, as the interval from the failure detection to the completion of the switching operation (for each single connection to be protected). Let us denote tOCC as the time to process a single RSVP-TE message, and tnodeSwitch as the OADM switching time. tnodeSwitch includes the switching times of the physical device plus the processing time at the optical node upon receiving a command through the CCI interface (tnode + tswitch).

We determine the expression of tSPP considering two extreme cases for the LSPs to be protected: a) The origin and destination nodes are adjacent to the failure (hereafter adjacent LSP); b) The LSP origin node is adjacent to the failure, while the destination node is the farthest one (maximum number of hops). Note that the OCCs adjacent to the failure are notified by their associated OADMs after a tCCI interval.

Fig. 7-4 presents an example of the signaling involved after a failure. In this example a 5-node ring is considered, where four LSPs need to be protected after a failure in the link C-D. The LSP C-D is an adjacent LSP. One of the end nodes of the LSPs A-C and F-C is adjacent to the failure. The LSP E-B is a pass-through LSP. In Fig. 7-4 the length of each bar is proportional to its processing time. Finally, the left dotted line below every node represents activity in the control plane while the right dotted line represents activity in the optical node.

[Figure: time diagram of the signaling at nodes A, B, C, D, E, F and A after the failure. LSP labels: C-D, E-B, A-D, F-C.]

Fig. 7-4 SPP with extra-traffic time model.

For the adjacent LSPs, the RSVP-TE signaling has to travel from one OCC to the adjacent one using the opposite side of the ring. Each OCC has to process the Path and Resv messages and send configuration messages to its OADM to perform the switch. Let us denote ra as the number of adjacent LSPs to be protected. To obtain the average case, let us assume that half of these LSPs have their origin in one of the adjacent nodes, while the rest have their origin node in the other node adjacent to the failure. Hence, ra/2 connections will be serially processed on each of the adjacent OADMs.

For the LSPs with one end in one node adjacent to the failure, the RSVP-TE signaling messages travel from the origin OCC to the destination OCC through the opposite side of the ring. Let us denote r as the total number of LSPs to be protected.

All LSPs, independently of their destination nodes, have their protection route through the nodes that are n/2-1 hops away from the nodes adjacent to the failure. Therefore, those OADMs have to set up r connections serially. Depending on the values of tnodeSwitch and r, this serial processing of connections can take longer than the propagation delay around the ring. We can express the time to protect as:

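$$t_{SPP} = 2\,t_{CCI} + t_{OCC} + \max\left\{\begin{array}{l}(2n-3)\,t_{link} + (2n-2)\,t_{OCC} + \dfrac{r_a}{2}\,t_{nodeSwitch}\\[4pt] \left(\dfrac{n}{2}-1\right)\left(t_{link}+t_{OCC}\right) + r\,t_{nodeSwitch}\end{array}\right\} \qquad (7.2)$$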

The first two terms are the time needed to detect the failure and to send the first switching command from the control plane to the transport plane (2tCCI), and the processing time in the OCC adjacent to the failure (tOCC). The max{} function captures the maximum of two terms: the time to protect the adjacent LSPs around the ring, and the time to protect all the affected traffic, given that multiple connections coincide in some nodes.
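As an illustration, the model can be evaluated numerically. The following Python sketch assumes that Eq. (7.2) reads t_SPP = 2 tCCI + tOCC + max{adjacent term, serial term}, with the adjacent term (2n-3) tlink + (2n-2) tOCC + (ra/2) tnodeSwitch and the serial term (n/2-1)(tlink + tOCC) + r tnodeSwitch. The function name `t_spp`, its argument names and the default values (taken from Table 7-2) are choices made for this example only.

```python
def t_spp(n, r, r_a, t_node_switch, link_km,
          t_occ=0.5, t_cci=1.0, t_link_per_100km=0.5):
    """Protection time (ms) of SPP in an n-node ring, following Eq. (7.2) as read here."""
    t_link = t_link_per_100km * link_km / 100.0
    # Failure notification and first switching command over the CCI, plus OCC processing.
    fixed = 2 * t_cci + t_occ
    # Adjacent LSPs: signaling around the ring, r_a/2 switches per adjacent OADM.
    adjacent = (2 * n - 3) * t_link + (2 * n - 2) * t_occ + (r_a / 2) * t_node_switch
    # All r LSPs are switched serially at the nodes n/2 - 1 hops away from the failure.
    serial = (n / 2 - 1) * (t_link + t_occ) + r * t_node_switch
    return fixed + max(adjacent, serial)


if __name__ == "__main__":
    # 15-node ring, 100 km links, 20 LSPs (4 of them adjacent), 1.5 ms node switching time.
    print(t_spp(n=15, r=20, r_a=4, t_node_switch=1.5, link_km=100))
```

Keeping the two arguments of max{} as separate variables makes it easy to see which effect (ring signaling or serial switching) dominates for a given set of parameters.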

7.4 Performance Evaluation

The performance of shared-path protection with extra-traffic in terms of protection time has been experimentally evaluated over the ASON/GMPLS CARISMA network test-bed described in Chapter 5. Using the measurement procedure explained above, we have obtained in our implementation the times presented in Table 7-2.

Table 7-2 Experimental times

Parameter   Description                                        Value
tLMP        Time to process a single LMP message               0.2 ms
tOCC        Time to process a single RSVP-TE message           0.5 ms
tlink(L)    Propagation delay in every link                    0.5 ms/100 km
tCCI        Communication time between the OADM and the OCC    1 ms


Fig. 7-5 shows the time needed to protect 20 LSPs (we assume links with 40 wavelengths) as a function of tnodeSwitch, for different link lengths (L) and numbers of nodes (n). As shown, when tnodeSwitch is low, propagation and control plane processing times dominate the protection time. However, as tnodeSwitch increases, the serial processing of connections becomes the dominant effect. Thus, to protect 20 LSPs within 50 ms we need tnodeSwitch to be lower than 1.8 ms, even for small rings (where propagation time becomes negligible).

[Figure: protection time (ms), from 20 to 60, versus tnodeSwitch (ms), from 0.3 to 2.4. Curves: L=100, n=10; L=100, n=15; L=100, n=20; L=30, n=10; L=30, n=15; L=30, n=20.]

Fig. 7-5 Protection times against switching time

Fig. 7-6 shows the protection time as a function of the number of LSPs to protect, assuming a medium-size long-haul 15-node ring with 100 km links, for several tnodeSwitch values.

[Figure: protection time (ms), from 30 to 60, versus number of LSPs to protect (r), from 2 to 30, for L=100 and n=15. Curves: tnodeSwitch = 1 ms, 2 ms, 3 ms, 4 ms and 5 ms.]

Fig. 7-6 Protection times against the number of LSPs to protect


As shown, the higher tnodeSwitch is, the smaller the number of LSPs that can be protected within 50 ms after the failure detection. For example, with tnodeSwitch = 4 ms it is possible to protect only 9 LSPs within 50 ms.

Note that tnodeSwitch includes the physical WSS switching time and the time to process a request in the OADM. Currently available WSSs provide a physical switching time (tswitch) close to 2 ms, and the latest technology will provide commercial components with sub-millisecond physical WSS switching times [Go06, Go08]. However, telecom equipment is usually based on cards, where one card implements the interface with the control plane and another card holds the WSS component. Cards are interconnected through buses, and a specific command has to be generated in the card holding the WSS component. In our implementation (Chapter 5) we have measured the time from the reception of the command from the control plane until the WSS command is generated (tnode) as about 1.5 ms. Thus, assuming the switching times of currently available WSSs, the actual OADM switching time is about 3.5 ms. As a conclusion, the number of LSPs that can be protected within 50 ms is limited to 11.

On the basis of the previous experimental results, classes of protection (CoP) can be defined, as a function of the protection time (Table 7-3) to be guaranteed. This is similar to those defined in [BoKu01] for a mesh network. Due to the switching time restriction, we can dedicate up to 11 wavelengths to the SP-50 class, 9 wavelengths to the SP-100 class, and 20 wavelengths to the BE class.

Table 7-3 Classes of Protection (CoP)

Service   Description                                                  Protection time
SP-50     (Shared-path) Protected service.                             < 50 ms
SP-100    (Shared-path) Protected service.                             50 - 100 ms
BE        Best-effort service. Preemptable when resources are needed   Repair time
          for protection of the SP-x class.
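The split of wavelengths among classes can be checked with a short calculation. The sketch below is only illustrative and rests on several assumptions: the serial-processing term of the protection time model dominates, so that the protection time is 2 tCCI + tOCC + (n/2-1)(tlink + tOCC) + r tnodeSwitch; 20 of the 40 wavelengths carry protected working traffic while the other 20 carry extra traffic; and the ring is the 15-node, 100 km ring of Fig. 7-6 with the values of Table 7-2. The function names are hypothetical.

```python
import math


def max_lsps_within(target_ms, n, link_km, t_node_switch,
                    t_occ=0.5, t_cci=1.0, t_link_per_100km=0.5):
    """Largest number of serially processed LSPs protected within target_ms."""
    t_link = t_link_per_100km * link_km / 100.0
    # Time spent before the serial switching of connections starts to count.
    overhead = 2 * t_cci + t_occ + (n / 2 - 1) * (t_link + t_occ)
    return max(0, math.floor((target_ms - overhead) / t_node_switch))


def class_budget(working=20, extra=20, n=15, link_km=100, t_node_switch=3.5):
    """Wavelengths per class of protection for a given node switching time."""
    sp50 = min(working, max_lsps_within(50, n, link_km, t_node_switch))
    sp100 = min(working, max_lsps_within(100, n, link_km, t_node_switch)) - sp50
    return {"SP-50": sp50, "SP-100": sp100, "BE": extra}


print(class_budget())
```

With tnodeSwitch = 3.5 ms the sketch yields 11 wavelengths for SP-50, 9 for SP-100 and 20 for BE, in line with the figures given in the text.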

Additionally, the number of sub-50ms protected connections can be increased by using OMS protection. This is evaluated in the next chapter.

7.5 Summary

To implement SPP in GMPLS-controlled optical rings, the optical nodes (i.e., OADMs) have been properly designed. We have found experimentally that 1.8 ms is the maximum node processing plus switching time that provides SPP protection within 50 ms to the maximum number of LSPs per fiber when extra-traffic is supported. Moreover, it has been demonstrated that the number of LSPs that SPP can protect within 50 ms is 11 for the switching times provided by currently available WSSs. This opens the possibility to define two classes of protection: the SP-50 class, where connections are recovered in less than 50 ms, and the SP-100 class, for protection times under 100 ms.

However, it may be necessary to provide protection times within 50 ms to the complete set of LSPs. In the next chapter we provide protection at the OMS layer, protecting the complete set of LSPs with only one protection action and thus eliminating the serial processing effect seen in the present chapter.


Chapter 8

OMS Protection in ring-based networks

With the objective of providing protection within 50 ms, in this chapter we introduce OMS protection in GMPLS-based optical ring networks. At the OMS layer, the protection action is performed by the ROADMs adjacent to the failure; with just one protection action the complete bundle of DWDM channels in a fiber can be recovered. Therefore, this protection scheme is a good option for the core network, where most of the optical channels need to be protected.

The focus of this chapter is on dynamic optical rings supporting either dedicated or shared link protection (hereafter OMS DPRing and OMS SPRing respectively). The OMS DPRing scheme is deployed over two-fiber unidirectional rings; one fiber is dedicated to the working traffic while the other is reserved for protection. The OMS SPRing scheme is deployed over two-fiber bidirectional rings. The total capacity of each fiber is thus divided in two wavebands: one waveband is reserved to transport working LSPs, while the other is used to transport protection LSPs. Working and protection LSPs share each fiber in this case.

Two complete solutions to build ring-based dynamic optical networks with OMS DPRing and OMS SPRing protection capabilities are proposed and evaluated. Both proposals consist of: 1) the novel GMPLS Automatic Protection Switching (GAPS) mechanism, which coordinates the protection actions after failures, and 2) a new ROADM design to support OMS protection.


8.1 GMPLS-controlled OMS protection

In this section the GAPS mechanism is presented. It is an APS-like protocol, to be used for fault management purposes in OMS protection, running in a GMPLS-based control plane. GAPS is based on extending the currently standardized LMP protocol. First, the mechanism to control the OMS DPRing protection scheme is introduced; it is then extended to control the OMS SPRing scheme as well. Protection time models for GAPS-controlled OMS protected rings are then defined.

8.1.1 The GAPS mechanism

The APS protocol for SDH is defined in [G.841]. It is supported by bytes K1 and K2 of the SONET/SDH section overhead. We define GAPS as an IP-based protocol running in the control plane of ASON/GMPLS networks. GAPS protocol is only responsible for protection messages signaling, being complementary to the automatic provisioning functionalities in the control plane.

A link failure implies a Loss of Light (LoL). A failure on a link is detected and corrected by its adjacent nodes. Those nodes are called switching nodes and they use the bridge and switch actions for the protection of the working channel, as shown in Fig. 8-1.

[Figure: two adjacent optical nodes and their OCCs after a Loss of Light (LoL). Labels: LoL, Bridge, Switch.]

Fig. 8-1 Actions performed by the switching nodes.

As stated in Chapter 3, the OMS DPRing configuration consists of two counter-rotating rings. Working links in the transport plane carry regular traffic in the normal state (Fig. 8-2). When a network component fails, a switch event occurs and the working link is protected using backup links. Let us assume that OMS DPRings are remotely controlled by a GMPLS control plane, which can be transported out-of-band, either in-fiber or out-of-fiber [RFC-3945]. For the sake of simplicity, we assume that the topology of both control and transport plane is the same (Fig. 8-2), although the GAPS mechanism would work for any control plane topology, independently of the transport plane one.


[Figure: four-node ring (Node A to Node D). Control plane: OCC A to OCC D, interconnected through the I-NNI and attached to the nodes through the CCI. Transport plane: working channel and protecting channel.]

Fig. 8-2 GAPS-controlled OMS DPRing under normal conditions.

At boot time, the GAPS agent checks for current alarms in the ROADM/optical node. If no alarms affecting the working channel are present in the ROADM, it sends a message to its neighbors indicating the normal state and waits to receive the same message from them. After that, no more messages are sent until a fault condition is detected or an external command is issued. In this way, GAPS relies on LMP’s Hello mechanism for keep-alive messaging. Functionally, this is similar to the K1/K2 bytes in the SONET/SDH APS protocol. GAPS messages transport the information described in Table 8-1.

Table 8-1 Information Transported by GAPS Messages

Field                        Description
Source/Destination Node ID   Identifies the origin/destination nodes for this GAPS message.
                             Depending on the semantic for the message type, origin and
                             destination nodes represent head and tail nodes or vice versa.
Path                         Indicates whether the message is being sent to the short or
                             the long path.
Request Type                 Indicates the type of request. A request can be a condition
                             (LoL), a state (normal) or an external request (not covered
                             in this thesis).
Status                       Indicates the status of the protection switch.

After the detection of a LoL, the failure must be corrected by its adjacent transport nodes. These nodes, called switching nodes, use the bridge and switch actions for the protection of the working link (Fig. 8-3). Specifically, when an optical node detects a LoL, it notifies the failure to its corresponding OCC in the GMPLS-based control plane, which becomes the head end. The head end notifies the failure detection to the OCC (tail end) corresponding to the other adjacent optical node, which executes a bridge.

[Figure: four-node ring (Node A to Node D, OCC A to OCC D) after a LoL on the working channel. Numbered steps 1 to 13 show the ring bridge requests sent by the head end on the short and long paths, the pass-through at the intermediate nodes, the bridge at the tail end, the Tail Bridged messages returned on the short and long paths, and the switch at the head end.]

Fig. 8-3 OMS DPRing after a LoL detection.

To illustrate how the GAPS mechanism works, firstly, Fig. 8-4 shows the recovery from a link failure. The initial state of the ring is the normal state. In this state (T0), all OCCs in the ring have exchanged Normal State messages 1-4 with their neighbors.

At time T1, node A detects a LoL on its working link and notifies it to its OCC. When an OCC receives a notification of failure detection, it sends a switching request (i.e., GAPS messages) to the OCC of the adjacent node over the control network on both the short and the long path. The short path connects head and tail OCCs directly, while the long path connects them through the intermediate OCCs using the opposite side of the ring. Node A becomes then a switching node and its OCC becomes the head end. Head end OCC sends a bridge request. All intermediate OCCs on the long path enter full pass-through state. OCC D, upon reception of the bridge request from OCC A on the short path, transmits a LoL ring bridge. OCC D, upon reception of the bridge request from OCC A on the long path, executes a bridge, and updates its status. OCC A, upon reception of the ACK from OCC D on the long path, executes a ring switch, and updates its status. Signaling reaches then the steady-state.

At time T2, the LoL clears. Node A notifies this to its OCC, and OCC A enters the Wait-To-Restore (WTR) state, advertising its new state to the OCC D. Upon the reception of the WTR bridge request on the short path, OCC D sends out a message with the WTR code.


[Figure: GAPS message sequence among Node A, Node B, Node C and Node D (OCC A to OCC D) at times T0 to T3, distinguishing generated messages from retransmitted messages. Legend: NR: No Request; LOL: Loss of Light; WTR: Wait To Restore; RDI: Remote Defect Indication; Br: Bridged; Sw: Switched; S: Short path; L: Long path.]

Fig. 8-4 Failures management: GAPS messages

At time T3, the WTR interval expires. OCC A sends out a No Request message. OCC D, upon reception of the No Request from OCC A on the long path, drops its bridge, and generates the Idle code. OCC A, upon reception of the Idle code on the long path, drops its switch and also generates the Idle code. All OCCs return then to the normal state.

Since the protection channels are shared among all links, contention among the nodes may arise when multiple simultaneous failures occur. In these cases the request with the lowest head node identifier (ID) has priority. This mechanism is useful in OMS SPRing, where some LSPs can continue working in a double-failure scenario.

If in-fiber signaling is used, the GAPS mechanism is also able to manage node failures efficiently. In this case, when a LoL is detected, the detecting OCC includes the identifier of the destination node in the GAPS message. If the GAPS message reaches the other end of the failure (i.e., the destination OCC), that OCC will find itself as the destination, in which case the failure was indeed a link failure, as assumed. On the contrary, a node failure will be assumed if the message reaches an OCC adjacent to the destination and the destination OCC is unreachable. In the latter case, the OCC adjacent to the failed node acts as the destination and assumes its protecting role.
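The rule used to tell a link failure from a node failure can be sketched as follows. This is a minimal illustration rather than the GAPS implementation: the ring is modeled as an ordered list of node identifiers, `reachable` stands for the set of OCCs that can still be reached, and all names are hypothetical.

```python
def classify_failure(ring, head, destination, reachable):
    """Follow the long path from head towards destination and classify the failure.

    ring: node IDs in ring order; head and destination are adjacent in the ring.
    reachable: set of node IDs whose OCC can still be reached.
    Returns (kind, acting_destination).
    """
    n = len(ring)
    i = ring.index(head)
    # The long path leaves the head on the side opposite to the destination.
    step = -1 if ring[(i + 1) % n] == destination else 1
    current = head
    while True:
        nxt = ring[(ring.index(current) + step) % n]
        if nxt == destination:
            if destination in reachable:
                return ("link", destination)  # the message reached its destination
            return ("node", current)          # the neighbor assumes the protecting role
        current = nxt
```

For a ring A, B, C, D with head A and destination D, the message travels through B and C; if D does not answer, C acts as the destination.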

A simplified finite state machine for the GAPS mechanism is illustrated in Fig. 8-5. When in-fiber signaling is used, messages that head and tail end OCCs would exchange through the short path are never received. The transition from the normal state to the ring bridge destination state can be done both with an intermediate transition upon reception of the request message through the short path, or directly upon reception of the request through the long path. Receiving the request message through the short path allows accelerating the switching process by preparing the optical node. The same can be done in the intermediate nodes upon reception of the bridge request directed to the tail end.

[Figure: finite state machine. States: normal (initial state), ring switching (head), ring switched (head), wtr (head), ring switch dropped (head), passthrough, ring switching (dest.), ring bridged (dest.), wtr (dest.). Transition labels: working channel SF, destination bridged, working channel clear, wtr interval expires, no request, ring bridge req. (not dest.), ring bridge req. long path (dest.), ring bridge req. (short path), head switched, wtr bridge request.]

Fig. 8-5 GAPS mechanism: Finite State Machine
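A table-driven sketch of such a state machine is given next. The state and event names are those appearing in Fig. 8-5, but the set of edges is an interpretation based on the T0-T3 walkthrough and on the description above, not a transcription of the figure. The "head switched" label is left out because its source and target states are not described in the text.

```python
# Transition table: (current state, event) -> next state.
# ASSUMPTION: edges inferred from the textual description of the mechanism.
TRANSITIONS = {
    ("normal", "working channel SF"): "ring switching (head)",
    ("ring switching (head)", "destination bridged"): "ring switched (head)",
    ("ring switched (head)", "working channel clear"): "wtr (head)",
    ("wtr (head)", "wtr interval expires"): "ring switch dropped (head)",
    ("ring switch dropped (head)", "no request"): "normal",
    ("normal", "ring bridge req. (not dest.)"): "passthrough",
    ("passthrough", "no request"): "normal",
    ("normal", "ring bridge req. (short path)"): "ring switching (dest.)",
    ("normal", "ring bridge req. long path (dest.)"): "ring bridged (dest.)",
    ("ring switching (dest.)", "ring bridge req. long path (dest.)"): "ring bridged (dest.)",
    ("ring bridged (dest.)", "wtr bridge request"): "wtr (dest.)",
    ("wtr (dest.)", "no request"): "normal",
}


def run(events, state="normal"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        # Events with no entry for the current state leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
    return state
```

A table keeps the head, destination and pass-through roles in one place, and the direct transition used with in-fiber signaling is simply one more entry.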

When considering bidirectional rings (OMS SPRing protection), two GAPS entities are needed, one for each direction. Under normal conditions, the working wavebands in the transport plane are used to carry regular traffic. When a network component fails, a switch event occurs and the working wavebands are protected using the protection wavebands. A bidirectional link failure implies the LoL detection in the adjacent optical nodes, which notify the failure to their OCCs in the GMPLS-based control plane (Fig. 8-6). The adjacent OCCs exchange bridge requests.


[Figure: two adjacent optical nodes and their OCCs; both nodes detect the LoL and each performs a bridge and switch.]

Fig. 8-6 A failure in a bidirectional link is detected by its adjacent nodes.

8.1.2 GAPS LMP extensions definition

We define GAPS as an LMP extension, running in the GMPLS-based control plane of OMS protected networks. This avoids implementing a new control protocol, which would increase the signaling overhead. Specifically, GAPS relies on the control channel management functionalities provided by the LMP protocol. Once a control channel is activated between two adjacent nodes, the LMP Hello messages exchanged can be used to maintain control channel connectivity between the nodes. However, in order to run the GAPS mechanism, the definition of a novel LMP message is required:

GAPS Message

<GAPS Message> ::= <Common Header> <GAPS>

This message is used to transmit GAPS information when the LMP adjacency is part of an OMS protected ring. The GAPS message (Table 8-2) contains all the information needed by the GAPS protocol.

Table 8-2 GAPS Object Format

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Node ID                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Node ID                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Request Type          |     Path      |    Status     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
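To make the object layout concrete, the following sketch encodes and decodes the three 32-bit words with Python's standard `struct` module. The two node identifiers are taken as 32-bit fields, as in the object format. The split of the third word into a 16-bit Request Type, an 8-bit Path and an 8-bit Status is an assumption made for this example, and the numeric codes in the usage lines are hypothetical.

```python
import struct

# Two 32-bit node IDs followed by one 32-bit word holding the three short fields.
# ASSUMPTION: Request Type = 16 bits, Path = 8 bits, Status = 8 bits.
GAPS_FORMAT = "!IIHBB"  # network byte order, 12 bytes in total


def pack_gaps(source_id, destination_id, request_type, path, status):
    """Serialize the GAPS object fields into 12 bytes."""
    return struct.pack(GAPS_FORMAT, source_id, destination_id,
                       request_type, path, status)


def unpack_gaps(data):
    """Return (source_id, destination_id, request_type, path, status)."""
    return struct.unpack(GAPS_FORMAT, data)


# Hypothetical codes: request type 1 = LoL, path 1 = long path, status 0 = idle.
payload = pack_gaps(0x0A000001, 0x0A000004, 1, 1, 0)
print(len(payload), unpack_gaps(payload))
```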

From the functional point of view, the GAPS agent sits on top of two LMP agents, LMP east and LMP west. In this way, GAPS messages can be sent through either the east control channel or the west control channel.

8.1.3 Protection time Models for GAPS-controlled OMS protected rings

In this subsection, the models to calculate the protection time for OMS DPRing and OMS SPRing running with the GAPS mechanism are presented. The aim is to find out the switching time requirements to be imposed on the optical nodes to meet the protection time target.

Before defining the protection time, let us denote tconfig as the configuration time of the optical node, that is, the time to process a request from the OCC or to inform the OCC of any event. Note that tconfig is an end-to-end measure that includes the processing time at the CCI (that is, at the control or management plane and in the optical node) and the node processing time needed to generate the specific commands to the physical devices. We can thus express tconfig as:

$$t_{config} = t_{CCI} + t_{node} \qquad (8.1)$$

Let us define the protection time (tDPRing) in an OMS DPRing, as the interval from the decision to switch to the completion of the switching operation at the node initiating the bridge request. It includes, thus, the notification from the initiating optical node to its OCC (tconfig), the propagation delay in each control network link (tlink), the processing time in each OCC (tOCC), the time to configure each optical node in the ring (tconfig) to perform the switching action and, finally, the time to switch itself (tswitch). Then, tDPRing can be expressed as:

$$t_{DPRing} = 2\,t_{config} + t_{switch} + (2n-1)\,t_{OCC} + 2(n-1)\,t_{link} \qquad (8.2)$$

tswitch comes predefined by the switching device. We use a switch with a response time below 1ms, which is in line with the devices currently available in the market (see for example [Sercalo, Cube]). tlink depends on the ring link lengths (L) and on the signal speed through the fiber. Note that for metropolitan ring networks, tlink is negligible.
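For illustration, the two expressions can be evaluated as in the sketch below, which reads Eq. (8.1) as tconfig = tCCI + tnode and Eq. (8.2) as tDPRing = 2 tconfig + tswitch + (2n-1) tOCC + 2(n-1) tlink. The values used (tOCC = 0.5 ms and 0.5 ms per 100 km, borrowed from Table 7-2, tconfig = 5 ms and tswitch = 1 ms) are assumptions for this example, as are the function names.

```python
def t_config(t_cci, t_node):
    """Eq. (8.1): end-to-end configuration time of the optical node (ms)."""
    return t_cci + t_node


def t_dpring(n, link_km, t_cfg, t_switch=1.0, t_occ=0.5, t_link_per_100km=0.5):
    """Eq. (8.2) as read here: protection time (ms) of an n-node OMS DPRing."""
    t_link = t_link_per_100km * link_km / 100.0
    return 2 * t_cfg + t_switch + (2 * n - 1) * t_occ + 2 * (n - 1) * t_link


for n in (4, 10, 20):
    print(n, t_dpring(n, link_km=100, t_cfg=5.0))
```

The term 2(n-1) tlink grows with both the number of nodes and the link length, which is why tlink can be neglected only for metropolitan rings.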

The protection time model for GAPS controlling a bidirectional ring differs from that defined in (8.2) for OMS DPRing. In fact, in the case of OMS SPRing, GAPS has been extended to coordinate both protection actions, one for each direction, to be performed by the nodes adjacent to the failure. We assume that configuration actions are executed serially in the optical node. Let us define the protection time (tSPRing) in an OMS SPRing as the interval from the decision to switch to the completion of the switch operation at the node initiating the bridge request. In this case, tSPRing can be expressed as:

$$t_{SPRing} = 2\,t_{config} + t_{switch} + n\,t_{OCC} + (n-1)\,t_{link} + \max\left\{ t_{config},\; (n-1)\,t_{OCC} + (n-1)\,t_{link} \right\} \qquad (8.3)$$

The term max(a, b) expresses the fact that configuration actions are performed serially in the optical node. Note that equation (8.3) gives the same values as equation (8.2) unless the time to configure the optical node is higher than the time to transport the GAPS message around the ring. This happens only if the number of nodes in the ring is low or the distances between ring nodes are short. In such cases, the protection time in OMS SPRing, although higher than that of the OMS DPRing, will be lower than the objective.
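The relation between both models can be made explicit with a short derivation. It assumes that Eq. (8.2) reads tDPRing = 2 tconfig + tswitch + (2n-1) tOCC + 2(n-1) tlink and that Eq. (8.3) reads tSPRing = 2 tconfig + tswitch + n tOCC + (n-1) tlink + max{tconfig, (n-1) tOCC + (n-1) tlink}; the auxiliary symbol T_ring is introduced here only for convenience.

```latex
\begin{align*}
T_{ring} &= (n-1)\,(t_{OCC} + t_{link}) \\
t_{DPRing} &= 2\,t_{config} + t_{switch} + t_{OCC} + 2\,T_{ring} \\
t_{SPRing} &= 2\,t_{config} + t_{switch} + t_{OCC} + T_{ring} + \max\{t_{config},\, T_{ring}\} \\
t_{SPRing} - t_{DPRing} &= \max\{0,\; t_{config} - T_{ring}\}
\end{align*}
```

Under this reading both models coincide whenever T_ring is at least tconfig, and OMS SPRing is slower by tconfig - T_ring otherwise. For instance, with the illustrative values tconfig = 5 ms and tOCC = tlink = 0.5 ms, T_ring reaches tconfig at n = 6, while a 4-node ring gives T_ring = 3 ms and a gap of 2 ms.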

Fig. 8-7 and Fig. 8-8 show the theoretical protection time for OMS DPRing and for OMS SPRing as functions of the number of nodes (n) in the ring, for several link lengths. In this analysis, we assume tconfig to be less than 5 ms. Thus, we can conclude that, by using the GAPS mechanism, the typical target for the protection time (i.e., 50 ms) is met even in rings composed of a large number of nodes. However, this implies strict requirements on the hardware of the optical nodes (e.g., tcontrol and tconfig). The obtained results show the scalability of GAPS when the number of nodes in the ring is increased.

[Figure: theoretical protection time (ms), from 0 to 75, versus number of nodes in the ring, from 4 to 20. Curves: T(DPRing) for L=100 km, L=200 km and L=300 km, and the objective.]

Fig. 8-7 Protection time for OMS DPRings

[Figure: theoretical protection time (ms), from 0 to 75, versus number of nodes in the ring, from 4 to 20. Curves: T(SPRing) for L=100 km, L=200 km and L=300 km, and the objective.]

Fig. 8-8 Protection time for OMS SPRings


8.2 ROADM Design to Support OMS Protection Schemes

Besides the design of the GAPS mechanism, we have designed two new optical nodes able to satisfy the requirements derived in the last section; one for OMS DPRings and another for OMS SPRings. Optical nodes are based on two WSS, one used for adding and the other for dropping the local traffic [SyTz06].

In the OMS DPRing node (Fig. 8-9), two optical power meters (labeled with M in Fig. 8-9) measure the incoming optical power at east and west ports. Two 2x2 optical switches have been added to the WSS components allowing OMS protection. The two pairs of optical Mux/demux are responsible for coupling the WDM-multiplexed bundle with the in-fiber OSC Channel which transports the control plane information. Two 1300 nm optical transponders (λOSC) are used to convert electrical fast Ethernet signal to the optical domain.

The node uses the pilot tone-based failure localization procedure designed in Chapter 7. The optical power meters monitor the incoming optical power levels at west and east inputs, and notify the node controller upon the reception of out of bounds levels. Upon receiving this advertisement, the node controller will send a LoL notification to the OCC. Note that if the link is not affected by a failure optical power must always be received at each end of the link. When considering out-of-fiber signaling, the OSC does not transport the control channel. Nevertheless, the associated optical hardware cannot be eliminated because an optical pilot signal is still necessary to detect that any failure affecting the link has been repaired.

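[Figure: OMS DPRing optical node with east and west ports, two WSSs, two optical power meters (M), two Mux/Demux pairs (Mx, Demx) and two OSC transponders (λOSC).]

Fig. 8-9 Optical Nodes design to support OMS DPRing scheme.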

The OMS SPRing optical node is also based on WSS, as shown in Fig. 8-10. We use WSS components with a capacity of 40 channels in the C-band. Although the optical node is defined as bidirectional, it is important to highlight that, as in the OMS DPRing optical node, only two WSS are used, keeping the cost and the complexity of the node low. We have defined waveband B1 as channels 1-20, and B2 as channels 21-40.

Under normal conditions, B1 received from east and B2 received from west are used to transport the traffic, while B2 east and B1 west are used for protection. Band splitter (BS) components are used to divide the WDM bundle into two bands, while splitters (S) are used to separate optical signals or to join both bands. To implement OMS SPRing protection, four 2x2 optical switches are used to choose which bands (B1, B2) transport the traffic and which bands are used for protection. The resulting cost and complexity of the OMS SPRing optical node are not much higher than those of the OMS DPRing optical node; in fact, only two 2x2 optical switches and a set of passive components (splitters and band splitters) are added.

Fig. 8-10 Optical Node design to support OMS SPRing scheme

The functionality depicted in Fig. 8-9 and Fig. 8-10 has been implemented in the optical node presented in Chapter 5, and experimental results are presented in the next section.

8.3 Node implementation and evaluation

In the physical node, each active card in the optical node is equipped with an ARM7 32-bit RISC processor [ARM] running at 100 MHz. The Processor Card controls the components in the card and manages the communication with the Master Card. The Master Card communicates through an internal RS422 serial bus with each of the cards in the optical node and through a fast Ethernet interface with the control and management planes. The Master Card is based on the UNC90 microcontroller module [UNC90], which is equipped with an ARM9 32-bit RISC processor [ARM] running at 180 MHz. The UNC90 module includes, in addition to other elements, the RISC processor, 32 MByte SDRAM and 32 MByte Flash Memory. The internal architecture of the optical node is shown in Fig. 8-11.

The OSNL card processor includes an interrupt-driven system that allows the CPU to continue processing instructions until a request from the Master Card arrives. In the meantime, the card processor executes a polling loop that continuously reads power samples from the monitoring register. If the samples read remain out of bounds for a configurable detection time, the card processor declares a Loss of Light (LoL) condition. This condition is signaled to the Master Card by sending a proprietary message through the serial bus. The Transponder card processor executes commands from the Master Card related to switching the lasers on/off and controlling the electrical switch.
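The detection rule applied by the polling loop can be sketched as follows. This is an illustrative model rather than the card firmware: it assumes evenly spaced power samples and a single lower threshold, and the function and parameter names are hypothetical.

```python
def detect_lol(samples_dbm, threshold_dbm, sample_period_ms, detection_time_ms):
    """Return the index of the sample at which LoL is declared, or None.

    LoL is declared once the power has stayed below the threshold for at least
    detection_time_ms; with detection_time_ms = 0 the first low sample triggers it.
    """
    first_low = None
    for i, power in enumerate(samples_dbm):
        if power >= threshold_dbm:
            first_low = None  # power is back in bounds, restart the window
            continue
        if first_low is None:
            first_low = i
        if (i - first_low) * sample_period_ms >= detection_time_ms:
            return i
    return None
```

A non-zero detection time filters out short power glitches at the cost of a later declaration, which is the trade-off behind making this time configurable.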

Fig. 8-11 Optical Node internal architecture

The Master processor runs the Linux operating system. A node agent has been developed to manage the whole optical node, interfacing the cards with the control and management planes. The agent listens for incoming data on serial and TCP/UDP ports. When a message indicating the LoL condition is received through the serial port, the application sends an SNMP trap that carries the related information to the OCC in the control plane and/or the NMS in the management plane.

The agent on the Master Card accepts request-response commands using both an XML-based proprietary protocol and the standard SNMP. When a message is received through a TCP/UDP port, the application decodes it and, if needed, initiates a communication with another card in the optical node through the serial bus.

In the previous section, 5 ms was specified as the maximum for tconfig in order to provide fast protection times (<50 ms). This highly stringent requirement calls for a careful review of the subsystems that make up the optical node. In order to optimize the system, some bottlenecks have been detected and corrected. One of the most important is related to the TTY device driver architecture in the Linux kernel [CoRu05] (Fig. 8-12). The TTY core takes data from a user that is to be sent to a TTY device. It then passes it to a TTY line discipline driver, which controls the flow of data and then passes it to the TTY driver. The TTY driver converts the data into a format that can be sent to the hardware. The TTY driver is also responsible for sending any data received from the hardware to the TTY core. The TTY core buffers the data in a structure called flip buffer until it is pushed to the user.

[Figure: the Node Agent on the Master Card, on top of the Linux kernel TTY core (with its flip buffer), TTY line discipline and TTY driver, which accesses the serial bus.]

Fig. 8-12 Node Agent architecture

Linux considers serial ports as being high latency devices, so when data is received into the flip buffer the TTY core schedules itself to push the data to the user application at some later point in the near future. This behavior introduces an unacceptable delay in the system. To avoid this default high latency in the serial transmission, the Linux kernel was modified to define the serial driver as a low latency driver, immediately pushing the data to the user application, the node agent.

In order to measure the efficiency of the optical node, in this section we experimentally measure the tnode specified above by means of the experiment depicted in Fig. 8-13. Two laser emitters are connected to the inputs of an OSNL card, which is equipped with two optical power meters and one optical switch. The OSNL card processor continuously supervises the optical power at the two meters. One output of the optical switch is connected to a digital oscilloscope. The OSNL and Master cards communicate with each other through the serial bus.

When laser A is switched off (we use another optical switch to obtain a “clean” outage), the OSNL card processor detects the LoL after a configurable detection time. For timing experiments of this kind we set that detection time to 0; therefore, as soon as the OSNL card processor measures a power sample below the threshold, it declares the LoL condition. After the LoL has been detected (to1), the OSNL card processor activates an interrupt line to the Master processor. The Master processor then sends a command (tm1) through the serial bus (ts1) asking the OSNL card about its state. Upon reception of the command, the OSNL processor answers the Master with the LoL condition (to2). The message is sent over the serial bus (ts2) and arrives at the Master, which decides to perform an optical switching (tm2). It therefore sends the specific command to the OSNL card (ts3), which creates the set of signals needed to drive the optical switch (to3). The switch is physically performed after tswitch (recall that this time is 1ms, imposed by the physical device).

Fig. 8-13 Node time experiment (td: detection time; ts: serial bus time; to: OSNL processing time; tm: Master processing time)

The obtained measurement is shown in Fig. 8-14. Note that it covers the whole switching process, i.e., it represents 2*tnode + tswitch. Therefore, we can estimate tnode as about 1.5ms. This is an important result, which we use in the following chapters.

Fig. 8-14 Experimental 2*tnode + tswitch time (optical power in dBm versus time in ms; measured switching time: 3.92 ms)
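As a worked check of the tnode estimate, the 3.92 ms reading marked in Fig. 8-14 can be combined with the 1 ms switching time of the physical device stated earlier. This only makes the rounding explicit; it is not an additional measurement:

```latex
\[
2\,t_{node} + t_{switch} = 3.92\ \text{ms}
\quad\Longrightarrow\quad
t_{node} = \frac{3.92 - 1}{2}\ \text{ms} = 1.46\ \text{ms} \approx 1.5\ \text{ms}
\]
```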

In the previous chapter we experimentally obtained the part of the configuration time related to the hardware architecture of the optical node. Recall that we defined tconfig in equation (8.1). In this section we complete this measurement by including the processing time at the CCI.

To carry out this experimental test, we set up an experiment similar to the one detailed above, this time including an OCC connected to the optical node through the CCI. When the master processor at the optical node receives the LoL notification from the OSNL card, it sends an SNMP trap to the OCC. Upon reception of the trap, the OCC sends an XML message to command the optical switch.

Fig. 8-15 shows the measured switching time when the protection decision is taken by the OCC. Note that the measured time includes one configuration time to notify the OCC, another for the OCC to send the switching command, and finally the time to switch in the physical component (2*tconfig + tswitch).

Fig. 8-15 Experimental 2*tconfig+tswitch value (optical power in dBm versus time in ms; measured switching time: 9.89 ms)

Therefore, we estimate tconfig ≈ 4.5ms, which is below the 5ms requirement previously defined for this time.
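The estimate above follows from splitting the oscilloscope reading into two equal stages plus the physical switching time. The sketch below makes that arithmetic explicit; the function name and the equal-stage split are illustrative assumptions, while the 9.89 ms reading, the 1 ms tswitch and the 5 ms requirement come from the text.

```python
def per_stage_time_ms(total_ms: float, t_switch_ms: float, stages: int = 2) -> float:
    """Split a measured switching time into equal per-stage times.

    total_ms is assumed to equal stages * t_stage + t_switch.
    """
    if stages <= 0:
        raise ValueError("stages must be positive")
    return (total_ms - t_switch_ms) / stages


T_SWITCH_MS = 1.0      # physical switching time stated in the text
T_CONFIG_MAX_MS = 5.0  # requirement on tconfig stated in the text

# Reading from Fig. 8-15: 2*tconfig + tswitch = 9.89 ms
t_config = per_stage_time_ms(9.89, T_SWITCH_MS)
meets_requirement = t_config < T_CONFIG_MAX_MS

if __name__ == "__main__":
    print(f"tconfig is about {t_config:.2f} ms; requirement met: {meets_requirement}")
```

The same helper applied to the 3.92 ms reading of Fig. 8-14 gives the tnode estimate.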

8.4 OMS Experimental results

Fig. 8-16 shows the experimental results for the protection time as a function of the number of nodes for OMS DPRing and OMS SPRing. Because of the testing environment, tlink is negligible. For accuracy, the figures reflect the average value over 10 experiments. From these results, and applying (7.1) for OMS DPRing and (7.2) for OMS SPRing, we found that tcontrol is less than 0.1ms. Recall that tconfig was experimentally measured in Chapter 5, being lower than 4.5ms. These results are better than those specified in previous sections; thus, the obtained behavior will also be better than that shown in Fig. 8-7 for OMS DPRing and in Fig. 8-8 for OMS SPRing.


Note that the experimental protection times in Fig. 8-16 do not include any propagation time (the tlink terms in (7.1) and (7.2)). Therefore, the specific propagation time has to be added to these experimental times. For example, in a 12-node OMS DPRing with 300 km links, the term 2(n-1)*tlink represents an additional delay of 33ms. Thus, the protection time in this case would be 11.90+33=44.90 ms, which is better than the 48.6ms value specified in previous sections.
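The correction for propagation can be sketched as follows. The 5 µs/km fiber delay is an assumption, chosen because it reproduces the 33 ms quoted for 22 spans of 300 km; the function names are illustrative, and only the 2(n-1)*tlink term of the OMS DPRing expression is modeled.

```python
PROPAGATION_MS_PER_KM = 0.005  # assumed 5 us/km; consistent with 33 ms over 22 x 300 km


def dpring_propagation_ms(n_nodes: int, link_km: float) -> float:
    """Propagation term 2(n-1)*tlink for a ring with equal-length links."""
    if n_nodes < 2:
        raise ValueError("a ring needs at least 2 nodes")
    t_link_ms = link_km * PROPAGATION_MS_PER_KM
    return 2 * (n_nodes - 1) * t_link_ms


def dpring_protection_time_ms(measured_ms: float, n_nodes: int, link_km: float) -> float:
    """Measured protection time (tlink negligible) plus the propagation term."""
    return measured_ms + dpring_propagation_ms(n_nodes, link_km)


# Example from the text: 12 nodes, 300 km links, 11.90 ms measured
total_ms = dpring_protection_time_ms(11.90, 12, 300.0)

if __name__ == "__main__":
    print(f"{total_ms:.2f} ms")
```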

Fig. 8-16. Evolution of protection time with the number of nodes (experimental protection time in ms, from 9 to 19, versus number of nodes, from 3 to 18; series: DPRing and SPRing)

As an example of the results in Fig. 8-16, Fig. 8-17 and Fig. 8-18 show the experimental tDPRing and tSPRing, respectively, for the worst scenario considered, i.e., the one with 18 nodes.

Fig. 8-17. OMS DPRing Protection time for rings with 18 nodes (optical power in dBm versus time in ms; measured protection time: 12.85 ms)

When the link failure is repaired, optical power is detected again by the adjacent optical nodes. At this moment, the WTR period starts. After the WTR time, the protection is reverted and the signal is switched from the protection links back to the working links.

Therefore, it is possible to deploy rings with 20 nodes and a total length of 2,000 km, or rings with 16 nodes and 3,200 km, both with a recovery time below 50ms. These results show that our GAPS-based solutions scale linearly with the number of nodes in the ring and with the link length. Thus, we can conclude that GAPS, in conjunction with the designed ROADMs, provides OMS protection under 50ms in rings with a high number of nodes.

Fig. 8-18. OMS SPRing Protection time for rings with 18 nodes (optical power in dBm versus time in ms; measured protection time: 15.71 ms)

Finally, a comparison of both solutions, assuming W wavelengths/link, is shown in Table 8-3. Although the cost and complexity of the OMS DPRing solution are lower than those of the OMS SPRing, the increment in the ROADM cost due to the OMS support is very low in both cases. OMS SPRing is more bandwidth efficient and provides better availability than OMS DPRing.

Table 8-3 Comparison of OMS solutions

| | OMS DPRing | OMS SPRing |
|---|---|---|
| Transported traffic (LSPs) | W, independent of the number of nodes (n) in the ring | From W to Wn/2, depending on the traffic pattern |
| Protection time | Fastest (<50 ms) | Fast (<50 ms) |
| Cost | Lowest | Low |

8.5 Summary

In this chapter two OMS protection solutions (DPRing and SPRing) for GMPLS-controlled optical ring networks, based on a novel GAPS mechanism, have been presented. OMS protection makes it possible to recover all optical channels in a fiber with just one protection action. Since the overall protection time increases linearly with the number of nodes, the scalability of the GAPS mechanism has been demonstrated. From the obtained results we conclude that a ring-based network using the designed ROADM nodes and controlled by the GAPS protocol will provide survivability with an SDH-like service recovery time (<50ms), even in large optical rings.

A pay-as-you-grow strategy can be implemented using both schemes: OMS DPRing can be used in networks where the expected traffic demand is lower than the number of wavelengths available in each link; if the traffic grows, the migration from OMS DPRing to OMS SPRing consists of adding one OSNL card and another card for the passive components (splitters and band splitters) to every optical node in the ring.


Chapter 9

OMS – SPP real-time mechanism

In Chapter 6 we presented the SPP with extra-traffic scheme. Protecting at the LSP layer in rings provides high availability. However, the recovery time depends upon the number of LSPs to protect, since the protection switching is performed for every single LSP (Chapter 7). To address this issue, we defined two classes of protection with different requirements in terms of protection time. Recall that part of the extra-traffic will be saved when protecting at the path layer.

In Chapter 8 we presented protection schemes at the OMS layer. Protecting at the OMS layer provides the same availability as protecting at the LSP layer, while providing recovery times under 50ms when the GAPS mechanism controls the protection at the control plane. OMS protection schemes also allow protection resources to be used to transport preemptable extra-traffic. However, in the event of a link failure, all extra-traffic is preempted and the protection resources are used to save the protected traffic.

In this chapter we present a mechanism to provide protection times under 50 ms. It chooses the layer (OMS or LSP) in which the protection will be performed as a function of the number of LSPs to protect, thus minimizing the amount of extra-traffic preempted.

9.1 Shared-Path Protection and OMS Shared Protection

The objective of the OMS schemes is to provide protection times under 50ms to the complete set of protected LSPs. We assume OADMs providing switching times of 3.5ms. OMS SPRing also allows the protection resources to be used to transport extra-traffic. However, unlike SPP, in OMS SPRing all extra-traffic is preempted when recovering the protected traffic after a link failure.

OMS SPRing and SPP schemes can coexist in the same ring. As in our proposal for SPP, OMS SPRing divides the total capacity of each fiber into two wavebands: the working and the protection wavebands. Therefore, wavebands can be used to support both SPP and OMS SPRing. In order to support the SPP scheme with extra-traffic, in Chapter 7 we designed the OADM block depicted in Fig. 7-1. To also support the OMS SPRing scheme, we have added to that OADM block a set of optical switches, splitters, couplers, and band splitters (BS), together with the additional hardware to monitor the incoming optical power (Fig. 9-1).

Fig. 9-1 OADM design supporting both SPP and OMS SPRing. (Diagram labels: West, East, band splitters BS, bands B1 and B2, switches s, M, IO, OADM w/SPP.)

Using the designed OADM, which supports both levels of protection, we propose a real-time mechanism that decides which protection scheme is the most appropriate to apply upon detection of a failure, based on the number of currently established LSPs (r) to be restored. If r≤11, SPP can be performed, keeping the protection time under 50ms while maximizing the total traffic transported by the ring, as some preemptable LSPs can continue working. If r>11, on the contrary, SPP cannot keep the protection time under the limit, so OMS SPRing is performed instead, although all extra-traffic will be preempted.
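The decision rule can be written as a small function. This is a minimal sketch, not the thesis implementation: the names `Scheme` and `choose_scheme` are hypothetical, and only the threshold of 11 LSPs comes from the text.

```python
from enum import Enum


class Scheme(Enum):
    SPP = "SPP"
    OMS_SPRING = "OMS SPRing"


R_THRESHOLD = 11  # from the text: SPP stays under 50 ms while r <= 11


def choose_scheme(r: int, threshold: int = R_THRESHOLD) -> Scheme:
    """Pick the protection layer from the number of LSPs (r) to restore."""
    if r < 0:
        raise ValueError("r must be non-negative")
    # LSP-layer protection preserves part of the extra-traffic, so it is
    # preferred whenever it can still meet the 50 ms target.
    return Scheme.SPP if r <= threshold else Scheme.OMS_SPRING


if __name__ == "__main__":
    for r in (5, 11, 12):
        print(r, choose_scheme(r).value)
```

Keeping the threshold as a parameter reflects that it depends on the switching time of the devices, which the text assumes to be 3.5ms.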

To illustrate the impact of the chosen protection method on the spare resources, let us assume uniformly distributed traffic to be transported over the ring. We also assume that the shortest route is always used for the working LSP. In the seven-node ring example of Fig. 9-2, a particular link (e.g. 3-4) can transport working LSPs of 1 .. n/2 (1, 2, and 3) hops long. Depending on the end nodes, several distinct LSPs with the same hop count may exist, as shown in Fig. 9-2. In case of failure, we can calculate the average number of spare data-links (dl) that will be used to protect a single working LSP as the product of the number of hops used for protection and the probability of each distinct LSP, summed over all hop counts:

$$dl = \frac{\sum_{i=1}^{\lfloor n/2 \rfloor} (n-i)\, i}{\sum_{i=1}^{\lfloor n/2 \rfloor} i} \qquad (9.1)$$

In our example, the average number of data-links used to protect one LSP when SPP is applied is 4.67 data-links/LSP. Using OMS SPRing, the number of data-links used to protect link 3-4 is (7-1)x20 = 120, which corresponds to the whole network spare capacity.
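The two figures of the seven-node example can be reproduced with a short script. It is a sketch under stated assumptions: each hop count i is weighted by i distinct working LSPs crossing the link (as read from Fig. 9-2), an i-hop working LSP needs n - i protection hops, and the factor 20 in the text is taken to be the number of protection data-links per link. Function names are illustrative.

```python
def average_spare_datalinks(n: int) -> float:
    """Average protection hops per working LSP crossing one link of an n-node ring."""
    if n < 3:
        raise ValueError("a ring needs at least 3 nodes")
    max_hops = n // 2  # shortest-path working LSPs span 1..floor(n/2) hops
    # i distinct i-hop LSPs cross a given link; each is protected over n - i hops
    weighted = sum(i * (n - i) for i in range(1, max_hops + 1))
    count = sum(range(1, max_hops + 1))
    return weighted / count


def oms_spring_datalinks(n: int, protection_datalinks_per_link: int) -> int:
    """Spare data-links used when the whole protection band of the n-1 surviving links is taken."""
    return (n - 1) * protection_datalinks_per_link


if __name__ == "__main__":
    print(round(average_spare_datalinks(7), 2))
    print(oms_spring_datalinks(7, 20))
```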

Fig. 9-2. All distinct LSPs routed through link 3-4 (seven-node ring, nodes 1 to 7).

Fig. 9-3 shows the number of spare data-links that are preempted, on average, as a function of the number of protected connections for different ring sizes (n). When the number of LSPs to protect is low (r≤11), SPP is performed and the spare capacity used grows linearly with the number of LSPs. When the number of LSPs to protect is greater than the threshold (r>11), OMS SPRing has to be applied in order to keep the protection time within 50ms; in this case, the whole spare capacity is used. Fig. 9-3 also shows, as a percentage on the right Y axis, the evolution of the spare data-links that continue working after the protection is performed.
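The shape of these curves can be approximated with the sketch below. It is not the code behind Fig. 9-3: it assumes 20 protection data-links per link for every ring size (the text gives that value only for the seven-node example), uses the average of equation (9.1) for the linear SPP region, and caps the result at the total spare capacity as a safeguard of our own.

```python
def preempted_datalinks(r: int, n: int, per_link: int = 20, threshold: int = 11) -> float:
    """Average spare data-links preempted when r LSPs are protected in an n-node ring."""
    total_spare = (n - 1) * per_link
    if r > threshold:
        return float(total_spare)  # OMS SPRing: the whole spare capacity is used
    hops = range(1, n // 2 + 1)
    dl = sum(i * (n - i) for i in hops) / sum(hops)  # average per LSP, eq. (9.1)
    return min(r * dl, float(total_spare))           # SPP: linear in r


def saved_fraction(r: int, n: int, per_link: int = 20, threshold: int = 11) -> float:
    """Fraction of spare data-links that keep carrying extra-traffic."""
    total_spare = (n - 1) * per_link
    return 1.0 - preempted_datalinks(r, n, per_link, threshold) / total_spare


if __name__ == "__main__":
    for r in (0, 5, 11, 12, 20):
        print(r, preempted_datalinks(r, 10), f"{saved_fraction(r, 10):.0%}")
```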


Fig. 9-3. Spare capacity used (dl) against the number of protected connections (r). When r≤11 SPP is applied, or else OMS SPRing is applied. (Left Y axis: preempted data-links, 0 to 400; right Y axis: saved data-links, 0% to 100%; X axis: connections to protect (r), 0 to 20; series: dl (n=5), dl (n=10), dl (n=15), saved dl (%); regions: SPP and OMS SP.)

Fig. 9-4. Spare capacity used against the number of protected connections (Diagram labels: working band and protection band; working LSPs and extra-traffic; normal conditions and after a failure.)

In order to minimize the amount of extra-traffic that is preempted after a failure, the working and protection bands should be filled in the order shown in Fig. 9-4. The working band should be filled starting from the lowest data-link index, in ascending order. Recall that if the routing algorithm chooses data-link i for the working LSP of a connection, then the protection LSP will be routed through data-link W-i+1. Thus, in the protection band, the lower the data-link index, the lower the probability of being preempted after a failure. As a consequence, the same ascending order should be used in the protection band to assign wavelengths to the extra-traffic.
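The index mapping and the resulting fill order can be illustrated as follows. This sketch assumes that W is the total number of data-links per link and that the two bands split it evenly (working band 1..W/2, protection band W/2+1..W); the even split and the function names are assumptions made for the example.

```python
def protection_index(i: int, w: int) -> int:
    """Data-link used by the protection LSP when the working LSP uses data-link i."""
    if not 1 <= i <= w // 2:
        raise ValueError("working index must lie in the working band 1..W/2")
    return w - i + 1


def extra_traffic_order(w: int) -> list[int]:
    """Protection-band data-links in ascending order (least likely preempted first)."""
    return list(range(w // 2 + 1, w + 1))


if __name__ == "__main__":
    W = 40  # hypothetical: 20 working + 20 protection data-links
    # Working LSPs fill 1, 2, 3, ... so their protection lands on W, W-1, W-2, ...
    print([protection_index(i, W) for i in (1, 2, 3)])
    # Extra-traffic therefore starts from the opposite end of the protection band.
    print(extra_traffic_order(W)[:3])
```

With this ordering, extra-traffic and protection LSPs meet in the protection band only when the working band is nearly full.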


9.2 Summary

In this chapter a real-time mechanism to choose the layer where protection is applied in case of a link failure has been presented. It provides recovery times under 50ms independently of the number of LSPs to protect.

In case the number of LSPs to be protected is higher than the threshold, the OMS scheme must be applied. In line with this, as a further contribution, a real-time mechanism which, on the basis of the number of protected connections, decides which protection scheme has to be applied (OMS SPRing or SPP) has been defined and evaluated. Both methods can coexist using a new OADM design which supports protection at both layers.

As a final conclusion, on the basis of the experimental results, using SPP only a portion of the spare capacity is used, whereas using OMS SPRing all spare capacity needs to be used.


Chapter 10

Closing Discussion

10.1 Main Contributions

Three main objectives were proposed at the beginning of this thesis: firstly, we assumed the 50ms figure as the recovery time objective; secondly, several protection mechanisms covering different requirements needed to be provided; finally, we found it desirable to choose the protection scheme in real time as a function of the actual state of the network.

Within this thesis, physical optical nodes have been designed and built. To meet the stringent performance requirements, their software and hardware architectures have been optimized. Experimental results proved the efficiency of the node.

Moreover, important modules for the ASON/GMPLS CARISMA network test-bed have been developed. In particular, an algorithm to compute disjoint routes under the wavelength continuity constraint has been presented. Its performance was experimentally compared with a generally accepted RWA algorithm. The configuration time for both the physical node and the node emulator has been experimentally obtained. Finally, two strategies for fault localization have been presented and experimentally compared in terms of localization time.

Protection schemes at both the OMS and the LSP layers have been proposed and experimentally evaluated. At the OMS layer, the GAPS mechanism, which coordinates the protection actions after failures, has been presented. Moreover, new nodes to support OMS protection have been designed. At the LSP layer, the shared path protection with extra-traffic scheme has been implemented. The protection time provided by this scheme has been analyzed as a function of the WSS. We demonstrated that the switching time of the WSS prevents the complete set of affected connections from being protected within 50 ms after fault detection. This limitation can be solved by traffic differentiation.

Finally, a mechanism to provide protection times under 50 ms to the complete set of connections has been presented. It chooses the layer (OMS or LSP) in which the protection is performed as a function of the number of LSPs to protect.

Therefore, the main objectives of this thesis have been successfully achieved.

10.2 Publications

10.2.1 Journals

[GCOJON09] F. Agraz, L. Velasco, J. Perelló, M. Ruiz, S. Spadaro, J. Comellas, and G. Junyent, “Design and Implementation of a GMPLS-Controlled Grooming-capable Optical Transport Network,” Accepted for publication in OSA Journal of Optical Networking.

[JON09] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent, “Shared-path protection with extra-traffic in ASON/GMPLS ring networks,” OSA Journal of Optical Networking 8, 130-145 (2009).

[COMPNW08] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Introducing OMS protection in GMPLS-based optical ring networks,” Computer Networks 52, 1975-1987 (2008).

10.2.2 Conferences and workshops

[ECOC08] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent, “Wavelength Selective Switches time requirements for shared path protection in ASON/GMPLS rings,” In Proc. ECOC, P.5.13-P.5.14 (2008).

[ICTON08-1] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Real-time OCh-OMS protection scheme selection”, In Proc. of 10th International Conference on Transparent Optical Networks (2008).

[ICTON08-2] J. Perelló, L. Velasco, F. Agraz, S. Spadaro, G. Junyent, and J. Comellas, “A Comparison of In-Fiber and Out-Of-Fiber GMPLS-based Control Plane Configurations: Benefits, Drawbacks and Solutions”, In Proc. of 10th International Conference on Transparent Optical Networks (2008).


[ONDM08-1] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Capacity and availability comparison of OMS protection schemes in ASON/GMPLS mesh networks”, In Proc. ONDM, 189-193 (2008).

[ONDM08-2] L. Velasco, R. Romeral, F. Agraz, S. Spadaro, J. Comellas, G. Junyent, and D. Larrabeiti. “On the design of MPLS-ASON/GMPLS Interconnection Mechanisms”, In Proc. ONDM, 206-211 (2008).

[ECOC07] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Link Management Protocol extensions for OMS protection in GMPLS-based optical ring networks”. 33rd European Conference on Optical Communication, 247-248 (2007).

[ICTON07] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Experimental evaluation of OMS protection in GMPLS-based optical networks”. Proc. of 9th International Conference on Transparent Optical Networks, 193-196 (2007).

[GCOICTON07] S. Spadaro, J. Perelló, E. Escalona, L. Velasco, F. Agraz, J. Comellas, and G. Junyent. “The CARISMA ASON/GMPLS Network: Overview and Open Issues”. Proc. of 9th International Conference on Transparent Optical Networks, 18-21 (2007).

[JTI+D07] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Diseño de nodos totalmente ópticos reconfigurables”. Jornadas Telecom I+D, 1-4 (2007).

[DRCN07] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “ROADM design for OMS-DPRing in GMPLS based optical networks”. In Proc. DRCN, 1-7 (2007).

[WGN07] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Semi-Reconfigurable OADM node design for the CARISMA ASON/GMPLS Network”. VI Workshop in G/MPLS Networks, 97-107 (2007)

[ICTON06] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Failure aware diverse routing: a novel algorithm to improve availability in ASON/GMPLS networks”. Proc. of 8th International Conference on Transparent Optical Networks, 195-198 (2006).

[JTI+D06] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Alta disponibilidad en redes ASON/GMPLS”. XVI Jornadas Telecom I+D, 1-9 (2006).


[ePOSUM06] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Adaptive FAR: Improving availability through the knowledge of connection-holding time”. ePhoton One Summer School on Optical Grid and Optical Network Resilience, C31-C36 (2006).

[WGN06] L. Velasco, S. Spadaro, J. Comellas, and G. Junyent. “Diverse Routing Strategies for On-Demand Lightpath Provisioning in ASON/GMPLS Networks”. V Workshop in G/MPLS Networks, 141-150 (2006).

[WGN05-1] L. Velasco, J. Perelló, and G. Junyent. “Metropolitan Ethernet Networks (MEN)”. IV Workshop in G/MPLS Networks, 185-197 (2005).

[WGN05-2] J. Perelló, L. Velasco, and G. Junyent. “Operation, Administration and Maintenance in MPLS based Ethernet Networks”. IV Workshop in G/MPLS Networks, 199-209 (2005).

10.3 National and European Research Projects

Part of the material presented in this thesis has been developed within the framework of the following National and European research projects:

STREP “Dynamic Impairment Constraint Networking for Transparent Mesh Optical Networks (DICONET)” FP7- 216338, 2008-2010.

VISION “Comunicaciones de Vídeo de Nueva Generación”, CENIT, Programa Ingenio 2010, 2007-2010.

BONE NoE “Building the Future Optical Network in Europe: The e-Photon/ONe Network”, FP7-216863, 2008-2010.

TRILOGY project, Fundació i2CAT, 2008-2009.

"Red inteligente GMPLS/ASON con integración de nodos reconfigurables" RINGING, TEC2005-08051-C03-02, 2006-2008.

MACHINE project, Fundació i2CAT, 2007-2008.

NoE “Optical Networks: Towards Bandwidth Manageability and Cost Efficiency” (e_Photon/One) – Phase 2, UE-IST FP6-027497, 2006-2008.

EUREKA-CELTIC “Field Trial with Integrated ROADMs and GMPLS compliance” (FIRM), EUREKA-CELTIC C1-028, 2004-2006.


10.4 Topics for Further Research

10.4.1 Dynamically Managed Differentiated Services

The massive use of the Internet and multimedia applications greatly increases the traffic to be carried by optical networks. Typically, Internet traffic is not considered a critical service and the revenues obtained from carrying it are, in general, low. In contrast, business data traffic, which requires much less capacity than Internet traffic, is usually associated with strict SLAs, and SLA breaches turn into revenue losses for carriers. In between, carrying telephonic traffic provides regular revenues with less strict service requirements. Therefore, it is important for network operators to implement a class-of-service policy to differentiate traffic.

Business traffic, in one extreme, requires high availability. At the other extreme, Internet traffic can be considered as best effort traffic, while telephonic traffic, in turn, can be considered as unprotected traffic at the optical layer.

Regarding this topic, we are developing a method to maximize the total served traffic, and thus the operator revenues, in ASON/GMPLS networks. It is based on defining separate sets of resources, called class of service bands, to provide three different classes of service: the protected, the unprotected, and the best effort preemptable classes of service.

The size of the bands can be dynamically modified to increase or decrease the amount of resources assigned to a particular class of service, adapting this way the network to the traffic demand.

10.4.2 MPLS-GMPLS interconnection

MPLS and GMPLS interworking is a very active area of research and standardization. Within the IETF, the CCAMP working group [CCAMP] is leading a number of studies and extensions of existing protocols in order to achieve interworking between MPLS and ASON/GMPLS networks. [Ju07] defines a general list of requirements that any solution should satisfy.

In this regard, we propose a mechanism for connecting two or more MPLS islands belonging to the same MPLS domain through one ASON/GMPLS domain. We assume that the MPLS and ASON/GMPLS networks are managed by different operators; therefore, we are in a multi-domain scenario. The ASON/GMPLS network provides on-demand GMPLS LSPs to its client MPLS networks. The proposed solution will be evaluated according to the previous requirements.

However, in [Ju07] the recommended solution architecture is based on the Border Peer Model. In this model, the interface between the client (MPLS) and the server (ASON/GMPLS) networks is the Peer Border Node. This node has full topology visibility of both networks. Routing information is not distributed from one network to the other.

On the contrary, since we are in a multi-domain scenario with two different administrative networks, our solution is based on the Overlay Model [RFC-3945]. This model implies a request/response protocol between client and server networks, the UNI [RFC-4208]. The key elements defined in this model are the Edge Node and the Core Node. Edge nodes belong to the MPLS domain and core nodes belong to the ASON/GMPLS domain and are administered by different network operators.

Fig. 10-1. Example of two MPLS islands connected through one ASON/GMPLS domain. (MPLS island 1: R1, R2, R3; MPLS island 2: R4, R5, R6; border nodes OBN1 to OBN4; ASON/GMPLS domain with OXC1 to OXC3.)

The objective of this research topic is to design novel mechanisms that are needed to connect several MPLS islands through one ASON/GMPLS domain. An example of such a situation is shown in Fig. 10-1, where MPLS LSRs R1, R2 and R3 are in the MPLS island 1, and MPLS LSRs R4, R5 and R6 are in the MPLS island 2. Both MPLS islands are connected to the ASON/GMPLS domain through the Overlay Border Nodes (OBN) OBN1, OBN2, OBN3 and OBN4.


List of Acronyms

API Application Programming Interface

ASN.1 Abstract Syntax Notation 1

ASON Automatically Switched Optical Networks

BE Best-effort LSP (extra-traffic)

CAPEX Capital Expenditures

CC Connection Controller

CCAMP Common Control and Measurement Plane Working Group

CCI Connection Controller Interface

CD Chromatic Dispersion

CoP Classes of Protection

CP Connection Point

DCN Data Communication Network

DiR Differentiated Reliability

DPP Dedicated Path Protection

DPRing Dedicated Protection Ring

DWDM Dense Wavelength Division Multiplexing

E-NNI External Network-Network Interface

FA Forwarding Adjacency

FCAPS Fault, Configuration, Accounting, Performance, Security

FEC Forward Error Correction

FF First-Fit heuristic

FIT Failures In Time

GbE Gigabit Ethernet


GMPLS Generalized Multiprotocol Label Switching

GoP Grade of Protection

IETF Internet Engineering Task Force

I-NNI Internal Network-Network Interface

IP Internet Protocol

ITU International Telecommunications Union

LMP Link Management Protocol

LRM Link Resource Manager

LSP Label Switched Path

MIB Management Information Base

MPLS Multiprotocol Label Switching

MTU Maximum Transmission Unit

NMI-A Network Management Interface – Control Plane

NMI-T Network Management Interface – Transport Plane

NMS Network Management System

NTP Network Time Protocol

OADM Optical Add-Drop Multiplexer

OAM Operation Administration and Management

OCC Optical Connection Controller

OCh Optical Channel

OID Object Identifier

OMS Optical Multiplex Section

OPEX Operational Expenditures

OSPF Open Shortest Path First

OTN Optical Transport Network

OXC Optical Cross-Connect

PDH Plesiochronous Digital Hierarchy

PM Performance Monitoring

PMD Polarization Mode Dispersion

PSB Path State Block

QoS Quality of Service

RC Routing Controller

ROADM Reconfigurable Optical Add-Drop Multiplexer


RSVP Resource ReserVation Protocol

SDH Synchronous Digital Hierarchy

SFP Small Form-factor Pluggable transceiver

SLA Service Level Agreement

SNC Sub-Network Connection

SNMP Simple Network Management Protocol

SONET Synchronous Optical Network

SP Shared protected LSP

SPP Shared Path Protection

SPRing Shared Protection Ring

STM Synchronous Transport Module

TCP Termination Connection Point

TCP/IP Transmission Control Protocol / Internet Protocol

TE Traffic Engineering

UNI User Network Interface

UP Unprotected LSP

VoIP Voice over IP

WSS Wavelength Selective Switch

XML Extensible Markup Language


References

[ApZa04] R. Appelman, Z. Zalevsky, “All-Optical Switching Technologies for Protection Applications,” IEEE Optical Communications 11, S35- S40 (2004).

[Ar00] P. Arijs, B. Van Caenegem, P. Demeester, P. Lagasse, W. Van Parys, and P. Achten, “Design of Ring and Mesh based WDM Transport Networks”, Optical Networks Magazine, 3, 25-40 (2000).

[ArKa03] S. Arakawa, J. Katou, and M. Murata, “Design Method of Logical Topologies with Quality of Reliability in WDM Networks,” Photonic Network Communications, 5, 107–21 (2003).

[ARM] ARM: http://www.arm.com

[Bh99] R. Bhandari, “Survivable Networks: Algorithms for Diverse Routing,” Kluwer Academic Publishers (1999).

[BoKu01] E. Bouillet, K. Kumaran, G. Liu, and I. Saniee, “Wavelength Usage Efficiency versus Recovery Time in Path-Protected DWDM Mesh Networks,” Proc. Optical Fiber Commun. Conf. and Exhibition (OFC), TuG1 (2001).

[Book] Bookham: http://www.bookham.com

[Ca06] X. Cao, V. Anand, and C. Qiao. “Framework for waveband switching in multigranular optical networks part I-multigranular cross-connect architectures”, J. Optical Networking, 5, 1043-1055 (2006).

[CARISMA] CARISMA Project: http://carisma.ccaba.upc.edu.

[CCAMP] CCAMP, http://www.ietf.org/html.charters/ccamp-charter.html.

[ChMy07] P. Cholda, A. Mykkeltveit, B. Helvik, O. Wittner, and A. Jajszczyk, "A survey of resilience differentiation frameworks in communication networks", IEEE Communications Surveys 9, 32-55 (2007).

[Co03] J. Comellas, R. Martinez, J. Prat, V. Sales, and G. Junyent, "Integrated IP/WDM routing in GMPLS-based optical networks", IEEE Network 17, 22-27 (2003).


[CoRu05] J. Corbet, A. Rubini, and G. Kroah-Hartman, “Linux Device Drivers” Third Ed., O’Reilly Media, Sebastopol (2005).

[Cr93] D. Crawford, “Fiber Optic Cable Dig-ups, Causes and cures”, Network Reliability (1993).

[Cube] CUBE Optics: http://www.cubeoptics.com

[DoCl03] J. Doucette, M. Clouqueur, and W. D. Grover, “On the availability and capacity requirements of shared backup path-protected mesh networks,” Opt. Netw. Mag., 4, 29–44 (2003).

[DuGr94] D. Dunn, W. Grover, M. MacGregor, “Comparison of k-shortest paths and maximum flow routing for network facility restoration,” IEEE Journal on Selected Areas of Communications 2, 88–99 (1994).

[DwWa00] A. Dwivedi, R. Wagner, “Traffic Model for USA Long-Distance Optical Network,” in Proc. of Optical Fiber Communication Conference (OFC) 1, TuK1-1 156-158 (2000).

[EsFi05] E. Escalona, S. Figuerola, S. Spadaro, G. Junyent, “Implementation of a Management System for the ASON/GMPLS CARISMA network”, IV Workshop in G/MPLS Networks, 175-183 (2005).

[EsSp05] E. Escalona, S. Spadaro, J. Comellas, G. Junyent, “Establishing Source-Routed Bidirectional Connections over the Unidirectional ASON/GMPLS CARISMA Testbed,” In Proc. of 7th International Conference Transparent Optical Networks (ICTON) 2, 21-24 (2005).

[FuTa06] A. Fumagalli, and M. Tacca, “Differentiated Reliability (DiR) in Wavelength Division Multiplexing Rings”, IEEE/ACM Transactions on Networking, 14, 159-168 (2006).

[G.694.1] “Spectral grids for WDM applications: DWDM frequency grid,” G. 694.1 (ITU-T, 2002).

[G.707] “Network node interface for the synchronous digital hierarchy (SDH),” G.707 (ITU-T, 2003).

[G.709] “Optical Transport Network Interfaces,” G.709 (ITU-T, 2003).

[G.805] "Generic Functional Architecture of Transport Networks," G.805 (ITU-T, 2000).

[G.8080] “Architecture for the Automatically Switched Optical Networks (ASON),” G.8080 (ITU-T, 2001).

[G.841] “Types and characteristics of SDH network protection architectures,” G.841 (ITU-T, 1998).


[Ge07] O. Gerstel, and G. Sasaki, "Meeting SLAs by design: a protection scheme with memory", Proc. of the IEEE/OSA Optical Fiber Communications Conference (OFC), paper p.OThJ2 (2007).


[Go06] Y. Goebuchi, T. Kato, and Y. Kokubun, “Fast and Stable Wavelength-Selective Switch using Double-Series Coupled Dielectric Microring Resonator,” Photonics Technology Letters, 18, 538-540 (2006).

[Go08] Y. Goebuchi, M. Hisada, T. Kato, and Y. Kokubun, “Optical cross-connect circuit using hitless wavelength selective switch,” Optics Express 16, 535-548 (2008).

[Gr04] W. Grover, “Mesh-Based Survivable Networks,” Prentice Hall PTR (2004).

[Gr98] W. D. Grover, and D. Stamatelakis, “Cycle-oriented Distributed Preconfiguration: Ring-like Speed with Mesh-like Capacity for Self planning Network Restoration,” Proc. IEEE International Conference on Communications (ICC) (1998).

[Gu05] L. Guo, H. Yu, and L. Li, “A new shared-path protection algorithm under shared risk link group constraints for survivable WDM mesh networks,” Elsevier Optics Communications 246, 285-295 (2005).

[Gu07] L. Guo, and L. Lemin, “A Novel Survivable Routing Algorithm With Partial Shared-Risk Link Groups (SRLG)-Disjoint Protection Based on Differentiated Reliability Constraints in WDM Optical Mesh Networks”, IEEE J. of Lightwave Technology, 25, 1410-1415 (2007).

[IaCh02] G. Iannaccone, C. Chuah, R. Mortier, S. Bhattacharyya, and C. Diot, “Analysis of link failures in an IP backbone”, in Proc. ACM SIGCOMM IMW’02 (2002).

[Ju07] K. Jumaki, “Interworking Requirements to Support operation of MPLS-TE over GMPLS Networks”, draft-ietf-ccamp-mpls-gmpls-interwork-reqts (work in progress), (2007).

[Li05] M. Li, M. Soulliere, D. Tebben, L. Nederlof, M. Vaughn, and R. Wagner, “Transparent Optical Protection Ring Architectures and Applications”, IEEE J. of Lightwave Technology, 23, 3388-3403 (2005).

[LiKa02] G. Li, C. Kalmanek, B. Doverspike, “Fiber Span Protection in Mesh Optical Networks,” Optical Networks Magazine 3, 21-31 (2002).

[M.3400] “TMN management functions,” M.3400 (ITU-T, 2000).

[Ma03] S. Maesschalck, D. Colle, I. Lievens, M. Pickavet, P. Demeester, C. Mauz, M. Jaeger, R. Inkret, B. Mikac and J. Derkacz, “Pan-European Optical Transport Networks: An Availability-based Comparison,” Photonic Networks Communications 3, 203-225 (2003).

[MaCa08] R. Martínez, R. Casellas, and R. Muñoz, “Experimental evaluation of GMPLS enhanced routing for differentiated survivability in all-optical networks,” OSA J. Opt. Ntw. 7, 496-512 (2008).

[Mu05] R. Muñoz, C. Pinart, R. Martinez, M. Requena, A. Amrani, J. Sorribes, and G. Junyent, “Experimental GMPLS fault management for OULSR transport networks”, Optical Fiber Communications, OFC/NFOEC 3, paper JWA50 (2005).


[NetConf] “Network Configuration (Netconf),” http://www.ietf.org/html.charters/netconf-charter.html

[Ou04] C. Ou, J. Zhang, H. Zang, L. Sahasrabuddhe, and B. Mukherjee, “New and Improved Approaches for Shared-Path-Protection in WDM Mesh Networks,” IEEE J. of Light. Tech. 22, 1223-1232 (2004).

[PeEs07] J. Perelló, E. Escalona, S. Spadaro, J. Comellas, G. Junyent, “Resource Discovery in ASON/GMPLS Transport Networks”, IEEE Communications Magazine 45, 86-92 (2007).

[PePe04] P. Péloso, D. Penninckx, M. Prunaire, and L. Noirie, “Optical Transparency of a Heterogeneous Pan-European Network”, IEEE J. of Light. Tech. 22, 242-248 (2004).

[QUAGGA] GNU Quagga Routing Software. http://www.quagga.net

[RaSi95] R. Ramaswami, K. Sivarajan, “Routing and Wavelength Assignment in All-Optical Networks,” IEEE Transactions on Networking 5, 489-500 (1995).

[ReCo06] I. Redpath, D. Cooperson, R. Kline, "Metro WDM Networks Develop an Edge", Optical Fiber Communication Conference (OFC), 8, NThC1 (2006).

[RFC-2205] R. Braden, L. Zhang, S. Berson, S. Herzog, S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification," RFC 2205, (IETF, 1997).

[RFC-2210] J. Wroclawski, "The Use of RSVP with IETF Integrated Services," RFC 2210, (IETF, 1997).

[RFC-3209] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, G. Swallow, “RSVP-TE: Extensions to RSVP for LSP Tunnels,” RFC 3209, (IETF, 2001).

[RFC-3411] D. Harrington, R. Presuhn, and B. Wijnen, “An Architecture for Describing Simple Network Management Protocol (SNMP) Management Frameworks,” RFC 3411, (IETF, 2002).

[RFC-3471] L. Berger, “Generalized Multi-Protocol Label Switching (GMPLS) Signaling Functional Description,” RFC 3471 (IETF, 2003).

[RFC-3473] L. Berger, “Generalized Multi-Protocol Label Switching (GMPLS) Signaling Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Extensions,” RFC 3473 (IETF, 2003).

[RFC-3477] K. Kompella, Y. Rekhter, "Signalling Unnumbered Links in Resource ReSerVation Protocol - Traffic Engineering (RSVP-TE)," RFC 3477 (IETF, 2003).

[RFC-3630] D. Katz, K. Kompella, and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2,” RFC 3630 (IETF, 2003).

[RFC-3945] E. Mannie, “Generalized Multi-Protocol Label Switching (GMPLS) Architecture,” RFC 3945 (IETF, 2004).

[RFC-4203] K. Kompella, Y. Rekhter, “OSPF Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS),” RFC 4203 (IETF, 2005).

[RFC-4204] J. Lang, “Link Management Protocol (LMP),” RFC 4204 (IETF, 2005).

[RFC-4206] K. Kompella, Y. Rekhter, "Label Switched Paths (LSP) Hierarchy with Generalized Multi-Protocol Label Switching (GMPLS) Traffic Engineering (TE),” RFC 4206, (IETF, 2005).

[RFC-4208] G. Swallow, J. Drake, H. Ishimatsu, and Y. Rekhter, “Generalized Multiprotocol Label Switching (GMPLS) User-Network Interface (UNI): Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Support for the Overlay Model,” RFC 4208 (IETF, 2005).

[RFC-4394] D. Fedyk, O. Aboul-Magd, D. Brungard, J. Lang, and D. Papadimitriou, "A Transport Network View of the Link Management Protocol (LMP)", RFC 4394 (IETF, 2006).

[RFC-4397] I. Bryskin, and A. Farrel, "A Lexicography for the Interpretation of Generalized Multiprotocol Label Switching (GMPLS) Terminology within the Context of the ITU-T's Automatically Switched Optical Network (ASON) Architecture", RFC 4397 (IETF, 2006).

[RFC-4872] J.P. Lang, Y. Rekhter, and D. Papadimitriou, “RSVP-TE Extensions in support of End-to-End Generalized Multi-Protocol Label Switching (GMPLS) Recovery,” RFC 4872 (IETF, 2007).

[RFC-4873] L. Berger, I. Bryskin, D. Papadimitriou, and A. Farrel, “GMPLS Segment Recovery,” RFC 4873 (IETF, 2007).

[RoCo08] JP. Roorda, and B. Collings, “Evolution to colorless and directionless ROADM architectures,” OFC/NFOEC, paper NWE2 (2008).

[ScPr03] J. Schonwalder, A. Pras, and J. P. Martin-Flatin, “On the Future of Internet Management Technologies,” IEEE Commun. Mag. 41, 90–97 (2003).

[Sercalo] Sercalo: http://www.sercalo.com

[SVCo05] S. Verbrugge, D. Colle, P. Demeester, R. Huelsermann, and M. Jaeger, “General availability model for multilayer transport networks,” in Proc. of DRCN, 85-92 (2005).

[SyTz06] S. Sygletos, A. Tzanakaki, and I. Tomkos, “Numerical Study of Cascadability Performance of Continuous Spectrum Wavelength Blocker/Selective Switch at 10/40/160 Gb/s”, IEEE Photonics Technology Letters, 18, 2608-2610 (2006).

[TsHu06] J. Tsai, S. Huang, D. Hah, and M. Wu, “1xN2 Wavelength-Selective Switch with two cross-scanning one-axis analog micromirror arrays in a 4-f optical system,” IEEE J. of Light. Tech. 24, 897-903 (2006).

[ToMa05-1] M. Tornatore, G. Maier, and A. Pattavina, “Cost and benefits of survivability in an optical transport network,” Telektronikk 2, (2005).

[ToMa05-2] M. Tornatore, G. Maier, and A. Pattavina, “Availability Design of Optical Transport Networks,” IEEE J. on Sel. Areas in Comm. 23, 1520-1532 (2005).


[ToNe94] M. To and P. Neusy, “Unavailability Analysis of Long-Haul Networks,” IEEE J. on Sel. Areas in Comm. 12, 100-109 (1994).

[UNC90] DIGI UNC90 - Datasheet: http://www.digi.com/pdf/hwref_cc9u.pdf

[VaPiDe04] J-P. Vasseur, M. Pickavet, and P. Demeester, “Network Recovery - Protection and Restoration of Optical, SONET-SDH, IP and MPLS,” Elsevier, San Francisco, (2004).

[X.680] “Information technology – Abstract Syntax Notation One (ASN.1): Specification of basic notation”, X.680 (ITU-T, 2002).

[Wi08] P. Winzer, et al., “100-Gb/s DQPSK Transmission: From Laboratory Experiments to Field Trials”, IEEE J. of Light. Tech. 26, 3388-3402 (2008).

[ZaJu00] H. Zang, J. P. Jue, and B. Mukherjee, “A Review of Routing and Wavelength Assignment Approaches for Wavelength-Routed Optical WDM Networks,” Optical Networks Magazine, 1, 47-60 (2000).

[ZhMu04] J. Zhang and B. Mukherjee, “A Review of Fault Management in WDM Mesh Network: Basic Concepts and Research Challenges,” IEEE Network, 41-48 (2004).
