Analysis of Trouble Tickets Issued by APAN JP NOC Jin Tanaka [email protected] KDDI APAN NOC Session in Busan, Korea on 27 August 2003

Page 1: Analysis of Trouble Tickets Issued by APAN JP NOC

Analysis of Trouble Tickets Issued by APAN JP NOC

Jin Tanaka [email protected]

KDDI

APAN NOC Session in Busan, Korea on 27 August 2003

Page 2: Analysis of Trouble Tickets Issued by APAN JP NOC

Agenda

• Introduction to APAN JP Site NOC
• Statistics of Trouble Tickets
• Trouble analysis
  – Equipment in TokyoXP
  – TransPAC
• Characteristics of Our Trouble
• Proposal for improving Network Service Level

Page 3: Analysis of Trouble Tickets Issued by APAN JP NOC

APAN JP Site NOC

Page 4: Analysis of Trouble Tickets Issued by APAN JP NOC

APAN JP Site NOC:

Location:
– Physically located on the 12F of the KDDI Otemachi Bldg in Tokyo; the APAN Tokyo XP equipment is installed on the 5F

Staff:
– Operators on standby 24×7
– Operators are also charged with operations for other networks
  • Scientific, Academic, Commercial

Duties:
– Opening and closing of Trouble Tickets
– Receiving problem reports
– Troubleshooting
– Development and maintenance of measurement and operation tools

Page 5: Analysis of Trouble Tickets Issued by APAN JP NOC

APAN JP Site NOC: Monitoring Environment

(Diagram labels: KDDI Circuit Division; Operation Staff; OpenView NNM; Mail & Web Client; Physical Layer Monitor; KDDI and APAN segments; hubs; NOC on the 12F; APAN Equipment on the 5F)

• HP OpenView works independently in the NOC segment.
• The NOC staff use Mail & Web clients to detect alerts.
• The KDDI Physical Layer Monitor system observes circuits. When any alert is detected, we can check the same status as the KDDI Circuit Division.

Page 6: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets

Page 7: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets:

• Objects
  – All trouble tickets issued by APAN JP NOC for the last 12 months (August 2002 to July 2003)
  – The tickets total about 200
  – Issue-selecting rules
    • Trouble
      – All outages on TransPAC are covered. For others, outages of 15 minutes or more are covered.
    • Maintenance
      – All maintenance works are covered (including circuit switch-hits of less than 1 msec)
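The issue-selecting rules can be read as a small filter. The following is a minimal sketch, not the NOC's actual tooling; the function name `is_covered` and the labels "trouble" and "maintenance" are assumptions made for illustration, and only the 15-minute threshold and the TransPAC exception come from the slide.

```python
from datetime import timedelta

# Threshold from the slide: non-TransPAC outages are covered from 15 minutes.
MIN_OUTAGE = timedelta(minutes=15)


def is_covered(kind: str, on_transpac: bool, duration: timedelta) -> bool:
    """Return True if an event falls under the issue-selecting rules.

    kind is "trouble" or "maintenance" (hypothetical labels).
    """
    if kind == "maintenance":
        # All maintenance work is covered, however short the interruption.
        return True
    if kind == "trouble":
        # Every TransPAC outage counts; elsewhere only 15 minutes or more.
        return on_transpac or duration >= MIN_OUTAGE
    raise ValueError(f"unknown kind: {kind!r}")


# A 3-second TransPAC outage is covered, a 10-minute outage elsewhere is not.
print(is_covered("trouble", True, timedelta(seconds=3)))
print(is_covered("trouble", False, timedelta(minutes=10)))
print(is_covered("maintenance", False, timedelta(milliseconds=1)))
```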

Page 8: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets: Trouble Tickets on Tokyo XP

Fig1: Trouble Tickets on Tokyo XP

Number of tickets: Trouble 88, Maintenance 82, Tracking 29

Page 9: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets: Number of Monthly Tickets for Trouble/Maintenance

Fig2: Number of Monthly Tickets for Trouble/Maintenance (x-axis: month, 8/2002 to 7/2003; y-axis: number of tickets; series: Trouble, Maintenance)

Page 10: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets: Number of Monthly Tickets for Trouble

Fig2: Number of Monthly Tickets for Trouble on Circuit/Equipment/Others/Unknown (x-axis: month, 8/2002 to 7/2003; y-axis: number of tickets; series: Trouble (Total), Trouble (Circuit), Trouble (Equipment), Trouble (Others), Trouble (Unknown))

Page 11: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets: Number of Monthly Tickets for Maintenance

Fig3: Number of Monthly Tickets for Maintenance on Circuit/Equipment (x-axis: month, 8/2002 to 7/2003; y-axis: number of tickets; series: Maintenance (Total), Maintenance (Circuit), Maintenance (Equipment))

Page 12: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets: Total Length of Time of Trouble/Maintenance of APAN Tokyo XP

Fig4: Time Volume of Trouble/Maintenance of APAN Tokyo XP

Total time [hh:mm:ss]: Trouble 223:03:37, Maintenance 54:22:17

Page 13: Analysis of Trouble Tickets Issued by APAN JP NOC

Statistics of Trouble Tickets: Total Availability of APAN Network

Fig5: Total Availability of APAN Network

Time under Service 96.83%, Time under Trouble 2.55%, Time under Maintenance 0.62%
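The percentages in Fig5 appear to follow from the totals in Fig4 divided by the length of the observation period. A minimal sketch, assuming the period is exactly 365 days (August 2002 to July 2003); the helper name `to_seconds` is introduced here for illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hhh:mm:ss' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Assumption: the 12-month observation period is exactly 365 days.
PERIOD_SECONDS = 365 * 24 * 3600

trouble = to_seconds("223:03:37")      # total time under trouble (Fig4)
maintenance = to_seconds("54:22:17")   # total time under maintenance (Fig4)

trouble_pct = 100 * trouble / PERIOD_SECONDS
maintenance_pct = 100 * maintenance / PERIOD_SECONDS
service_pct = 100 - trouble_pct - maintenance_pct

print(f"trouble {trouble_pct:.2f}%  maintenance {maintenance_pct:.2f}%  "
      f"service {service_pct:.2f}%")
```

Under that assumption the result rounds to 2.55%, 0.62% and 96.83%, matching the figure.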

Page 14: Analysis of Trouble Tickets Issued by APAN JP NOC

Results of Trouble Tickets Statistics

• The total numbers of trouble and maintenance tickets are almost equal
• The number of tickets varies mainly in response to circuit trouble and maintenance, which is especially obvious on TransPAC
• Availability of the whole APAN network is 96.83% (97.45% when maintenance is excluded from outage)

Page 15: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis

Page 16: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis: Trouble Tickets Classified by Area

Fig6: Trouble Tickets Classified by Area (x-axis: area, Korea, Taiwan, Philippines, Thailand, China, Japan, U.S.A.; y-axis: number of tickets)

Legend: APAN Seoul XP, APAN Taiwan, PHnet, NECTEC, CERNET, AI3-NAIST, AI3-SFC, APAN Genkai XP, CRL, IMnet, KDDI LABS, MAFFIN, NIG, Osaka UNIV, QGPOP, RIKEN, SINET, Softopia, Tokyo UNIV, WIDE, TransPAC north, TransPAC south, Abilene, ANL, Genuity, NISN, StarLight

Page 17: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis: Total Outage Time Classified by Area

Fig7: Total Outage Time Classified by Area (x-axis: area, Korea, Taiwan, Philippines, Thailand, China, Japan, U.S.A.; y-axis: total time [hh:mm:ss])

Legend: APAN Seoul XP, APAN Taiwan, PHnet, NECTEC, CERNET, AI3-NAIST, AI3-SFC, APAN Genkai XP, CRL, IMnet, KDDI LABS, MAFFIN, NIG, Osaka UNIV, QGPOP, RIKEN, SINET, Softopia, Tokyo UNIV, WIDE, TransPAC north, TransPAC south, Abilene, ANL, Genuity, NISN, StarLight

Page 18: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis: Average Outage Time Classified by Area

Fig8: Average Outage Time Classified by Area (x-axis: area, Korea, Taiwan, Philippines, Thailand, China, Japan, U.S.A.; y-axis: average outage time [hh:mm:ss])

Page 19: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis: Number of Trouble Tickets by Trouble-occurring Area

Fig9: Number of Trouble Tickets by Trouble-occurring Area (x-axis: area, Korea, Taiwan, Philippines, Thailand, China, Japan, U.S.A.; y-axis: number of tickets)

Legend: Int'l circuit to Seoul XP; Equipment of PHnet; Maintenance at PHnet; Equipment of NECTEC; Int'l circuit to NECTEC; Domestic circuit in China; Equipment of CERNET; Int'l circuit to CERNET; Domestic circuit to Genkai XP; Equipment of Tokyo XP; JGN circuit in Japan; Operation mistake at Tokyo XP; Routing trouble of Tokyo XP; Equipment of Abilene; Equipment of ANL; Equipment of Genuity; Domestic circuit in U.S.A. about TransPAC north; Domestic circuit in U.S.A. about TransPAC south; Equipment of Indiana; Equipment of TransPAC; Int'l circuit to TransPAC north

Labels called out on the chart: Equipment of PHnet; Local circuit in China; Equipment of TokyoXP; Routing trouble of TokyoXP; Int'l circuit to TransPAC

Page 20: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis: Distribution by Reason for Amount of Troubles

Fig10: Distribution by Reason for Amount of Trouble

Circuit 38.64%, Unknown 31.82%, Equipment 22.73%, Routing 4.55%, Operation mistake 1.14%, Maintenance 1.14%

Page 21: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis: Distribution by Reason for Outage Time

Fig11: Distribution by Reason for Outage Time

Unknown 40.62%, Equipment 36.36%, Circuit 15.97%, Routing 6.68%, Operation mistake 0.37%
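Comparing Fig10 with Fig11 shows which reasons produce longer-than-average outages. The sketch below divides each reason's share of outage time by its share of tickets, assuming both figures cover the same set of trouble tickets; a ratio above 1 means outages of that kind last longer than the overall average. The variable names are illustrative.

```python
# Share of trouble tickets per reason (Fig10), in percent.
ticket_share = {
    "Circuit": 38.64,
    "Unknown": 31.82,
    "Equipment": 22.73,
    "Routing": 4.55,
    "Operation mistake": 1.14,
}

# Share of total outage time per reason (Fig11), in percent.
time_share = {
    "Circuit": 15.97,
    "Unknown": 40.62,
    "Equipment": 36.36,
    "Routing": 6.68,
    "Operation mistake": 0.37,
}

# Ratio > 1: outages for this reason are longer than the average outage.
relative_length = {
    reason: time_share[reason] / ticket_share[reason] for reason in ticket_share
}

for reason, ratio in sorted(relative_length.items(), key=lambda kv: -kv[1]):
    print(f"{reason:18s} {ratio:.2f}")
```

Equipment trouble comes out at about 1.6 times the average outage length and circuit trouble at about 0.4 times.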

Page 22: Analysis of Trouble Tickets Issued by APAN JP NOC

Equipment Trouble Analysis in TokyoXP

Page 23: Analysis of Trouble Tickets Issued by APAN JP NOC

Equipment Trouble Analysis in TokyoXP: Classification by Vendor for TokyoXP

Fig12: Classification by Vendor for TokyoXP

Foundry 49.18%, Cisco 32.79%, Juniper 16.39%, Others 1.64%

Page 24: Analysis of Trouble Tickets Issued by APAN JP NOC

Equipment Trouble Analysis in TokyoXP: Classification by Software/Hardware for TokyoXP

Fig13: Classification by Software/Hardware for TokyoXP

Software 45.45%, Others 45.45%, Hardware 9.09%

Page 25: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis on TransPAC

Page 26: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis on TransPAC:

Fig14: Ticket Volume on Northern/Southern links
Number of tickets: Northern link 20, Southern link 5

Fig15: Total Outage Time on Northern/Southern links
Total time [hh:mm:ss]: Northern link 15:01:40, Southern link 16:49:00
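Dividing the total outage time of each link by its ticket count gives the mean outage per ticket. This is a small illustrative calculation using only the figures from Fig14 and Fig15; the helper `parse_hms` is a name introduced here.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse 'h:mm:ss' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# (number of tickets, total outage time) per link, from Fig14 and Fig15.
links = {
    "Northern link": (20, "15:01:40"),
    "Southern link": (5, "16:49:00"),
}

mean_outage = {
    name: parse_hms(total) / tickets for name, (tickets, total) in links.items()
}

for name, mean in mean_outage.items():
    print(f"{name}: {mean} per ticket")
```

The northern link had four times as many tickets, but each one was short (about 45 minutes on average); the southern link had few tickets with a mean of over three hours.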

Page 27: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis on TransPAC:

Fig16: Ticket Volume on TransPAC links Classified by Circuit/Equipment
Number of tickets (Northern link / Southern link): Equipment 3 / 1, Circuit 17 / 4

Fig17: Total Outage Time on TransPAC links Classified by Circuit/Equipment
Total time [hh:mm:ss] (Northern link / Southern link): Equipment 7:34:00 / 12:00:00, Circuit 7:27:40 / 4:49:00

Page 28: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis on TransPAC:

Fig18: Ticket Volume of Circuit Troubles on TransPAC links Classified by Reason
Number of tickets (Northern link / Southern link): US local 12 / 3, Japan local 2 / 1, Unknown 2 (one value given), Submarine cable 1 (one value given)

Fig19: Time Volume of Circuit Troubles on TransPAC links Classified by Reason
Total time [hh:mm:ss] (Northern link / Southern link): US local 7:27:29 / 3:59:00, Japan local 0:00:02 / 0:50:00, Unknown 0:00:03 (one value given), Submarine cable 0:00:06 (one value given)

Page 29: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis on TransPAC:

Fig20: Ticket Volume of Equipment Troubles on TransPAC links Classified by Reason
Number of tickets: StarLight 1, TransPAC and TokyoXP 2, TokyoXP 1, Unknown (no value given)

Fig21: Time Volume of Equipment Troubles on TransPAC links Classified by Reason
Total time [hh:mm:ss]: StarLight 12:00:00, TransPAC and TokyoXP 6:19:00, TokyoXP 1:15:00, Unknown (no value given)

Page 30: Analysis of Trouble Tickets Issued by APAN JP NOC

Trouble Analysis on TransPAC: Availability of TransPAC

Fig22: Availability of TransPAC Northern link
Time under Service 99.81942%, Time under Trouble 0.17155%, Time under Maintenance 0.009028%

Fig23: Availability of TransPAC Southern link
Time under Service 99.80732%, Time under Trouble 0.19197%, Time under Maintenance 0.000710%

• Northern link availability = 99.819422% (including trouble and maintenance)
• Southern link availability = 99.807319% (including trouble and maintenance)
• Total availability = 100 - ( (100 - 99.819422) * (100 - 99.807319) / 100 ) = 99.999652%
• Redundancy is achieved by the northern and southern links
• Fortunately, the two links had no outage at the same time!
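The total availability treats the two links as parallel paths whose outages are independent: the network is down only when both links are down, so the unavailabilities multiply once they are expressed as fractions rather than percentages. A minimal sketch under that independence assumption; the function name `combined_availability` is introduced here for illustration.

```python
def combined_availability(*availabilities_pct: float) -> float:
    """Availability (in percent) of redundant links with independent outages."""
    unavailable = 1.0
    for availability in availabilities_pct:
        # Convert each percentage to a fractional unavailability and multiply.
        unavailable *= (100.0 - availability) / 100.0
    return 100.0 * (1.0 - unavailable)


northern = 99.819422
southern = 99.807319
total = combined_availability(northern, southern)

print(f"{total:.6f}%")
```

With the two link figures from the slide this gives 99.999652%.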

Page 31: Analysis of Trouble Tickets Issued by APAN JP NOC

Characteristics of Our Trouble

Page 32: Analysis of Trouble Tickets Issued by APAN JP NOC

Characteristics of Our Trouble:

• Average outage time per trouble: 2:32:09
• Longest outage time per trouble: 34:45:00

Table1: APAN Network Outages

Minutes   Number of Tickets   Share of Tickets
0<        18                  20.45%
10<       13                  14.77%
30<       16                  18.18%
60<       19                  21.59%
120<      10                  11.36%
240<      4                   4.55%
480<      5                   5.68%
960<      3                   3.41%
Total     88

Fig22: APAN Network Outages

Fig23: Distribution of APAN Network Outages by Length of Time (x-axis: minutes; y-axis: number of tickets)
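The cumulative share of troubles cleared within a given time can be read off Table1. The sketch below assumes each label such as "10<" is the lower edge of its bin, so that the bin ends where the next one starts; the function name `share_cleared_within` is illustrative.

```python
import math

# (lower edge in minutes, number of tickets) from Table1.
bins = [(0, 18), (10, 13), (30, 16), (60, 19),
        (120, 10), (240, 4), (480, 5), (960, 3)]

total_tickets = sum(count for _, count in bins)


def share_cleared_within(limit_minutes: float) -> float:
    """Percentage of tickets whose whole bin lies at or below the limit."""
    cleared = 0
    for index, (_, count) in enumerate(bins):
        # Assumption: a bin's upper edge is the next bin's lower edge.
        upper = bins[index + 1][0] if index + 1 < len(bins) else math.inf
        if upper <= limit_minutes:
            cleared += count
    return 100.0 * cleared / total_tickets


for limit in (60, 120, 240):
    print(f"within {limit:3d} min: {share_cleared_within(limit):.1f}%")
```

Under this reading the cumulative share is about 53% at 60 minutes and 75% at 120 minutes; a different reading of the bin edges would shift these figures.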

Page 33: Analysis of Trouble Tickets Issued by APAN JP NOC

Characteristics of Our Trouble:

• 70% of all troubles are cleared up within 60 minutes
• Equipment troubles are noticeable, causing long outage time in many cases.
  – Utilizing housing sites and cooperating with vendors are important
• Domestic troubles are noticeable, but their average outage time is short
  – Sharing trouble information internationally is difficult (time zone, language)
• Troubles occurring on lower layers such as Layer 1 (circuit) and Layer 2 (Ethernet switch) are noticeable.
• Having redundant circuits and equipment, as seen on the TransPAC network, will be useful for shortening outage time.

Page 34: Analysis of Trouble Tickets Issued by APAN JP NOC

Proposal for Improving Network Service Level

Page 35: Analysis of Trouble Tickets Issued by APAN JP NOC

Proposal for Improving Network Service Level:

• Shortening of trouble-handling time
  – Start trouble-handling and announce the information quickly
    • Operation tools which enable us to issue trouble tickets automatically and announce information quickly.
  – Shorten trouble-shooting time
    • Remote trouble-shooting from other areas (cf. Router Proxy on Global NOC)
  – These are under examination in TokyoXP

• Worldwide information sharing
  – Installation of a shared information server providing the following information
    • Performance and operation status of the whole APAN network (cf. Animated Traffic map on Global NOC)
    • Trouble and maintenance information
    • Syslog of routers in XPs and APs
  ※ It is desirable that such a server be installed at a commercial ISP, distant from the APAN networks.
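As an illustration of what an automatic ticket-issuing tool might look like, the sketch below opens a ticket from a monitoring alert and produces a one-line announcement. Everything here is hypothetical: the `TroubleTicket` fields, the alert keys `target` and `time`, and the sample alert are assumptions for illustration and do not describe any tool used at TokyoXP.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TroubleTicket:
    """Hypothetical ticket record; field names are illustrative only."""
    ticket_id: int
    target: str                       # affected circuit or equipment
    opened_at: datetime
    closed_at: Optional[datetime] = None

    def close(self, when: datetime) -> None:
        self.closed_at = when

    def outage(self) -> Optional[timedelta]:
        # The outage length is only known once the ticket is closed.
        if self.closed_at is None:
            return None
        return self.closed_at - self.opened_at

    def announcement(self) -> str:
        status = "CLOSED" if self.closed_at else "OPEN"
        return (f"[{status}] Ticket #{self.ticket_id}: trouble on "
                f"{self.target} since {self.opened_at:%Y-%m-%d %H:%M}")


def open_ticket_from_alert(alert: dict, ticket_id: int) -> TroubleTicket:
    """Turn an alert (assumed to carry 'target' and 'time') into a ticket."""
    return TroubleTicket(ticket_id=ticket_id, target=alert["target"],
                         opened_at=alert["time"])


# Made-up sample alert, for demonstration only.
alert = {"target": "TransPAC north", "time": datetime(2003, 7, 1, 9, 0)}
ticket = open_ticket_from_alert(alert, ticket_id=1)
print(ticket.announcement())

ticket.close(datetime(2003, 7, 1, 9, 45))
print(ticket.announcement(), "outage:", ticket.outage())
```

Recording the open and close times on the ticket itself keeps the outage length available for the kind of statistics shown in this presentation.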

Page 36: Analysis of Trouble Tickets Issued by APAN JP NOC

Proposal for Improving Network Service Level:

• Redundant network configuration
  – The TransPAC links show that a redundant configuration is very effective in realizing high availability. It is desirable that we establish redundant configurations as much as possible.
• Monitoring of lower layers
  – For the operation of worldwide networks, it is very important to check the status of international circuits in cooperation with circuit carriers.
  – Possibility of using new Ethernet technologies, e.g.
    • BNDP – Bridge Neighbor Discovery Protocol
    • LFS – Link Fault Signaling (10GbE: 802.3ae)