05 - Handling BSS Alarm Monitoring Exercise


Upload: amarkonde

Post on 08-Apr-2015


NOCSUR BSS Alarm monitoring Exercises

CT6870en Version 1.0

Nokia Oyj

1 (14)

Alarm monitoring

The information in this document is subject to change without notice and describes only the product defined in the introduction of this documentation. This document is intended for the use of Nokia's customers only for the purposes of the agreement under which the document is submitted, and no part of it may be reproduced or transmitted in any form or means without the prior written permission of Nokia. The document has been prepared to be used by professional and properly trained personnel, and the customer assumes full responsibility when using it. Nokia welcomes customer comments as part of the process of continuous development and improvement of the documentation. The information or statements given in this document concerning the suitability, capacity, or performance of the mentioned hardware or software products cannot be considered binding but shall be defined in the agreement made between Nokia and the customer. However, Nokia has made all reasonable efforts to ensure that the instructions contained in the document are adequate and free of material errors and omissions. Nokia will, if necessary, explain issues which may not be covered by the document. Nokia's liability for any errors in the document is limited to the documentary correction of errors. NOKIA WILL NOT BE RESPONSIBLE IN ANY EVENT FOR ERRORS IN THIS DOCUMENT OR FOR ANY DAMAGES, INCIDENTAL OR CONSEQUENTIAL (INCLUDING MONETARY LOSSES), that might arise from the use of this document or the information in it. This document and the product it describes are considered protected by copyright according to the applicable laws. NOKIA logo is a registered trademark of Nokia Oyj. Other product names mentioned in this document may be trademarks of their respective companies, and they are mentioned for identification purposes only. Copyright Nokia Oyj 2006. All rights reserved.


Contents

1 Introduction
2 Monitoring network alarms
2.1 Purpose of the exercise
2.2 Using the Alarm Monitor
2.3 Handling maintenance regions
2.4 Handling correlated alarms (optional NMS feature)
3 Identifying and analysing fault situations
3.1 Purpose of the exercise
3.2 Alarm handling process
3.3 Handling associated alarms
3.3.1 Correlating alarms (optional NMS feature)
3.4 Developing corrective course of action


1 Introduction

The aim of this module is to give the participant the skills needed for network monitoring. Topics covered in this module include identifying alarm situations, analysing alarm information, determining the effect on the system, and deciding on a course of action. After completing this module, the participant should be able to:

Demonstrate the ability to configure the Alarm Monitor to monitor the alarms that are active in, and being generated by, either one network element or a set of network elements.

Demonstrate the ability to monitor incoming alarms and identify the affected element, the severity and the impact on the network. Then, using the NMS Alarm Monitor Tool, acknowledge the alarm and either complete or suggest a corrective course of action. In addition, distinguish the course of action to take should the alarm be correlated.


2 Monitoring network alarms

2.1 Purpose of the exercise

The aim of this exercise is to act as a review of how to use the NMS Alarm Monitor Tool. The monitor tool is used to detect new faults occurring in the network and to keep track of active alarms in the network. Participants therefore need the skills to use the tool. The final objective is to configure the monitor to monitor different maintenance regions, which will be used in later exercises.

2.2 Using the Alarm Monitor

The Alarm Monitor presents the active alarm situation of the network in the form of a list that is sorted by alarm class and time of the alarm. The list is updated automatically by the system at a time interval that can be set by the user.

1. Start the Alarm Monitor from the Top-level User Interface.

Solution:

2. Choose NOT to group the alarms by their class (***, **, etc.). How are they organised within the window?

Solution:

Results (tick one box):

by their COLOUR?
by ALARM NUMBER?
by TIME and DATE (alarms are not sorted)?
by ACKNOWLEDGEMENT?
by TIME and DATE (most recent alarm at the top)?


3. Select a single alarm and check the functions of the six commands on the tool bar.

Tool Bar        Description

4. How would you configure the monitor to update the alarm display every 20 seconds?

Result:
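The ordering described at the start of this section (by alarm class, then by time of the alarm) can be sketched in a few lines of Python. This is a minimal illustration only: the `Alarm` record, its field names and the alarm numbers are hypothetical, and the sort direction (most severe class first, newest alarm first within a class) is an assumption, not a statement about how the NMS implements its list.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical alarm record; field names are illustrative, not the NMS data model.
@dataclass
class Alarm:
    number: int
    stars: int          # alarm class: 3 for ***, 2 for **, 1 for *
    raised: datetime

def sort_by_class_and_time(alarms):
    """Sort by alarm class, then by time of the alarm.

    Assumption: most severe class first, newest alarm first within a class.
    Python's sort is stable, so sorting by time and then by class keeps the
    time order inside each class.
    """
    by_time = sorted(alarms, key=lambda a: a.raised, reverse=True)
    return sorted(by_time, key=lambda a: a.stars, reverse=True)

alarms = [
    Alarm(1001, 2, datetime(2006, 1, 10, 9, 0)),
    Alarm(1002, 3, datetime(2006, 1, 10, 8, 30)),
    Alarm(1003, 3, datetime(2006, 1, 10, 9, 15)),
]
ordered = [a.number for a in sort_by_class_and_time(alarms)]
print(ordered)
```

Both *** alarms come before the ** alarm, and the more recent of the two *** alarms is listed first.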


2.3 Handling maintenance regions

Network elements are grouped together into maintenance regions. Grouping elements in this manner can make it easier to distribute the task of monitoring. Ask the trainer for more information on the maintenance region that you should use.

Note: You are not expected to assign network elements to a maintenance region. This is a system administration task and involves using the Network Editor to create an MR (child of PLMN) object and assign network elements to it.

1. Check the Monitoring Criteria you are using.

Solution:

Are your alarms organised by:

Maintenance Regions
Managed Objects

2. Define Monitoring Criteria as DEFAULT.

Result:

3. List the elements (only the parents) that are part of the maintenance region.

4. How many currently active alarms are there in the maintenance region?

Result:


5. How many alarms arrived in the last hour and are still active?

Result:

6. How many *** alarms arrived during the last day and are still active?

Result:
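The three counting questions above (all active alarms, alarms from the last hour that are still active, *** alarms from the last day that are still active) are all filters over the same alarm list. The sketch below shows that idea in Python. The `Alarm` record, the region names `MR-1` and `MR-2` and the sample data are invented for illustration; in the exercise the answers come from the Alarm Monitor itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical alarm record; not the NMS data model.
@dataclass
class Alarm:
    region: str        # maintenance region name (made up for this sketch)
    stars: int         # alarm class: 3 for ***, 2 for **, 1 for *
    raised: datetime
    active: bool = True

def count_active(alarms, region, now,
                 window: Optional[timedelta] = None,
                 stars: Optional[int] = None):
    """Count still-active alarms in a region, optionally limited to a
    time window ending at `now` and to one alarm class."""
    count = 0
    for a in alarms:
        if not a.active or a.region != region:
            continue
        if window is not None and a.raised < now - window:
            continue
        if stars is not None and a.stars != stars:
            continue
        count += 1
    return count

now = datetime(2006, 1, 10, 12, 0)
alarms = [
    Alarm("MR-1", 3, now - timedelta(minutes=20)),
    Alarm("MR-1", 1, now - timedelta(hours=5)),
    Alarm("MR-1", 3, now - timedelta(hours=30)),
    Alarm("MR-1", 2, now - timedelta(minutes=5), active=False),
    Alarm("MR-2", 3, now - timedelta(minutes=1)),
]

active_total = count_active(alarms, "MR-1", now)
last_hour = count_active(alarms, "MR-1", now, window=timedelta(hours=1))
three_star_last_day = count_active(alarms, "MR-1", now,
                                   window=timedelta(days=1), stars=3)
```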

2.4 Handling correlated alarms (optional NMS feature)

When faults occur in the Nokia network, several alarms are generated by different network elements. An optional feature detects the presence of alarms and, using pre-defined rules, reduces the number of alarms seen by the operator, thus making the true fault situation clearer.

1. It is possible to indicate correlated alarms in the Alarm Monitor. If this indicator is enabled, you can see a + to the left of the alarm. How would you ENABLE the correlation indicator?

Solution:

Result:

If the alarms are correlated, how do you get an explanation of the correlation rule used? (T9, optional feature)

Solution:
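To make the idea of rule-based correlation concrete, the following Python sketch applies one invented rule: an alarm on a child object is hidden under the alarm of its parent object when the parent is also alarming. The rule, the function name `correlate` and the object names are assumptions made for illustration; the actual correlation rules of the optional NMS feature are not described in this document.

```python
def correlate(alarms, parent_of):
    """Group alarms under a root alarm using one hypothetical rule.

    alarms:    list of (alarm_id, object_name) tuples
    parent_of: dict mapping a child object name to its parent object name
    Returns a dict: shown alarm id -> list of alarm ids hidden beneath it.
    """
    alarm_on_object = {obj: alarm_id for alarm_id, obj in alarms}
    shown = {}
    for alarm_id, obj in alarms:
        parent = parent_of.get(obj)
        if parent in alarm_on_object:
            # The parent is alarming too: hide this alarm under the parent's.
            shown.setdefault(alarm_on_object[parent], []).append(alarm_id)
        else:
            # No alarming parent: this alarm stays visible on its own.
            shown.setdefault(alarm_id, [])
    return shown

alarms = [(1, "BCF-1"), (2, "BTS-1"), (3, "BTS-2"), (4, "BTS-9")]
parent_of = {"BTS-1": "BCF-1", "BTS-2": "BCF-1", "BTS-9": "BCF-5"}
result = correlate(alarms, parent_of)
```

With this sample data, four alarms are reduced to two visible entries: alarm 1 with alarms 2 and 3 beneath it, and alarm 4 on its own.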


3 Identifying and analysing fault situations

3.1 Purpose of the exercise

When fault situations occur in the network, a workflow has to be followed. The workflow used in the exercises is a generic model and is used to highlight the importance of following a working process.

3.2 Alarm handling process

The following flow chart is a generic model of how fault situations can be handled. First, identify the alarm and its location, and acknowledge associated alarms. The final part is to assess the effect on service.

Flow chart: alarm handling process

Alarm active
Alarm shown in NMS
Acknowledge alarm
Assess effect on service
Can NOC fix fault?
    yes: Determine cause, Fix fault (fixed)
    no, or unable to solve in time limit: Situation escalated
Reporting
Network cancels alarm once fault is fixed

Note: The trainer will give directions on which alarms should be handled. These may be alarms that are already active or, depending on the environment, the trainer may cause a fault situation.


Identify alarm situation and classify fault

1. Configure the alarm monitor so that only unacknowledged alarms are shown. How did you achieve this and what was the impact?

Solution:

2. Select the alarm entry with the highest classification of fault. How do you obtain more information? Also, what is the class of the alarm?

Solution:

Identify that alarm is being handled

3. As you are now working on the alarm, you should acknowledge it to identify to others that the alarm is being handled. How did you do this?

Solution:

The alarm will disappear from the monitor as the mode is set to unacknowledged alarms. Therefore, either change the mode of the alarm monitor to show all active alarms or use the alarm history.
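The effect described above, where an acknowledged alarm drops out of a view that shows only unacknowledged alarms, can be illustrated with a small Python sketch. The dictionary of alarm ids and the `visible` function are hypothetical and stand in for the monitor's display mode.

```python
# Hypothetical active alarms: alarm id -> acknowledged?
alarms = {101: False, 102: False, 103: True}

def visible(alarms, only_unacknowledged):
    """Return the ids the monitor would list in the chosen mode."""
    return sorted(
        alarm_id
        for alarm_id, acknowledged in alarms.items()
        if not (only_unacknowledged and acknowledged)
    )

before = visible(alarms, only_unacknowledged=True)
alarms[101] = True                      # acknowledge alarm 101
after = visible(alarms, only_unacknowledged=True)
all_active = visible(alarms, only_unacknowledged=False)
```

After the acknowledgement, alarm 101 is no longer in the unacknowledged view but is still present when all active alarms are shown.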

Identify cause and location of fault

4. Identify the element which generates the alarm. Also, identify its parent and geographical location.

Element:
Parent:
Location:


5. To gain more information on the alarm, the manual can help. Therefore, using the manual, briefly identify the possible cause of the alarm and the cancellation instructions. In addition, is there any supplementary information that is useful in determining the location and cause?

Meaning:

Cancel:

Assess impact on service

6. Depending on the type of alarm, identify the state of the network. Should the fault be with a base station, identify the state of the site using remote MML commands. The most commonly used MML command is ZEEI:BCF=site_number. The figure below describes how to interpret the information.

Figure: example ZEEI printout for a 3 Sectorised Site (3 BTS)

DX 200   BSC4-KUTOJA   1997-10-19   15:54:59

RADIO NETWORK CONFIGURATION IN BSC:

                          LAC    CI     ADM STA  OP STATE  ARFCN  ET-PCM  BCCH/CBCH  D-CHANN NAME  ST
BCF-001                                 U        WO                                  BCF01         WO
BTS-001  C1755DF21INDOOR  01755  05500  U        WO
TRX-001                                 L        BL-USR    10     34      MBCCHC
BTS-002  C1755DF21INDOR2  01755  05501  U        WO
TRX-002                                 U        WO        13     34      MBCCHC
BTS-003  C1755DF21INDOR3  01755  05502  U        WO
TRX-003                                 U        WO        16     34      MBCCHC

Other columns shown in the printout: BCU, BUSY HR, BUSY FR (values 0).

Callouts in the figure:
ADM STA: Administrative State
OP STATE: The Operational State
ARFCN: Radio Frequency
D-CHANN ST: LapD State
BCF row: The BCF Status (Site)


Using the figure, identify the state of the affected BTS.

State of the LapD:

Are any sites working?

What is the effect for the subscribers?
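Reading the states off the figure amounts to finding every object whose operational state is not WO (working). The Python sketch below does this on a reduced three-column listing (object, administrative state, operational state). The reduced layout is an assumption made for readability and is not the real ZEEI printout format; the state values follow the example figure as read here.

```python
# Reduced listing: object, administrative state, operational state.
# Assumed layout for illustration only, not the actual ZEEI printout.
ROWS = """\
BCF-001 U WO
BTS-001 U WO
TRX-001 L BL-USR
BTS-002 U WO
TRX-002 U WO
BTS-003 U WO
TRX-003 U WO
"""

def not_working(rows):
    """Return (object, adm_state, op_state) for every object not in WO."""
    affected = []
    for line in rows.splitlines():
        obj, adm_state, op_state = line.split()
        if op_state != "WO":
            affected.append((obj, adm_state, op_state))
    return affected

affected = not_working(ROWS)
print(affected)
```

In this listing only TRX-001 stands out: it is locked (L) and its operational state is BL-USR, that is, blocked by the user.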

Should the problem be a signalling link or a transcoder, determine the working state of the BSS. Using the ZNLI MML command, the following states can be seen:

DX 200   BSC1-KUTOJA   1997-10-23   17:23:12

SIGNALLING LINK STATES

LINK  LINK SET   LINK STATE  UNIT    TERM  TERM FUNCT  INFO
----  --------   ----------  ------  ----  ----------  ----
0     16 MSC01   AV-EX       BCSU-1  0     0
1     16 MSC01   AV-EX       BCSU-1  0     1

COMMAND EXECUTED

LINK STATE: The state of the signalling link. It should be AV-EX. Any other state indicates a problem in the link.

LINK SET: All the signalling links between two network elements are grouped together in a set.

LINK: The signalling link number. In this case, there are two signalling links between the BSC and MSC, which are connected via different TCSMs.

Is there a working signalling link between the BSC and MSC?

Yes

No

If no, what is the effect? How much of the network is affected?
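The check asked for above (is at least one link towards the MSC in state AV-EX?) can be sketched as a small parser in Python. The column layout in `SAMPLE` (link number, link set number and name, link state) is an assumed, simplified layout, and `OTHER` is a placeholder for any state that is not AV-EX; neither is taken from the MML reference.

```python
import re

# Assumed, simplified column layout; not the exact ZNLI printout format.
SAMPLE = """\
LINK  LINK SET   LINK STATE  UNIT    TERM  TERM FUNCT
0     16 MSC01   AV-EX       BCSU-1  0     0
1     16 MSC01   AV-EX       BCSU-1  0     1
"""

ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\S+)")

def link_states(printout):
    """Map link number -> (link set name, link state) for each data row."""
    states = {}
    for line in printout.splitlines():
        match = ROW.match(line)
        if match:  # header and separator lines do not start with a digit
            link, _set_number, set_name, state = match.groups()
            states[int(link)] = (set_name, state)
    return states

def working_links(printout, link_set):
    """Return the link numbers in the given link set that are in AV-EX."""
    return [
        link
        for link, (set_name, state) in link_states(printout).items()
        if set_name == link_set and state == "AV-EX"
    ]

both_up = working_links(SAMPLE, "MSC01")
one_down = working_links(SAMPLE.replace("AV-EX", "OTHER", 1), "MSC01")
```

If the returned list is empty, there is no working signalling link in that link set.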


3.3 Handling associated alarms

When faults such as disconnected BTSs or broken signalling links occur in a network, several alarms will occur almost simultaneously. As you have now identified the location of the fault, and before completing a course of action (either fixing the fault or escalating it to another group), it may be necessary to acknowledge all the alarms related to the fault situation. If you are using alarm correlation (that is, combining and/or reducing many alarms to a few), you will be able to see the situation more clearly.

1. Go through the alarm monitor and acknowledge those alarms that are related to the fault situation you are dealing with. If you are not sure whether an alarm is related or not, then leave it. How many alarms did you acknowledge and how did you acknowledge multiple alarms?

Solution:

3.3.1 Correlating alarms (optional NMS feature)

1. If you are using the NMS correlation feature, acknowledge those alarms that are generated due to the fault. In addition, check that the alarms in the correlation rule are also acknowledged.

Solution:

3.4 Developing corrective course of action

You should now have located the fault, identified the effect on service and possibly found the reason for the fault. Depending on the nature of the fault and your responsibilities, you are encouraged to make a decision about the next step.

Note: The student is not expected to be troubleshooting network problems at this stage in the course. Troubleshooting techniques are, therefore, not covered at this point.


2. The flowchart below helps to answer the question of whether NOC can fix the fault. The first stage is determining the reason for the fault, although in some cases the reason is quite straightforward. Can the fault be fixed at the NMS by NOC? If not, it has to be escalated. If the fault is to be handled by NOC, NOC handles it. How would you proceed in solving the problem?

Flowchart: deciding the corrective course of action

Start: Identified location and assessed effect on service
Decision: Is reason for fault known? (yes / no)
Decision: Can this be fixed at the NMS by NOC? (yes / no)
Decision: Is NOC expected to solve fault? (yes / no)
Action: Complete any logging that is needed (e.g. trouble ticket)
Outcomes: Handle Fault Situation* or Escalate to another group/individual/Nokia

* This is part of Fault Handling and is not covered in this module.
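The three decisions in the flowchart can be written as a short Python function. This is a sketch under a stated assumption: every "no" answer is taken to lead to escalation, and logging is completed before either outcome. The flowchart's exact branches are not fully recoverable from the text, so treat the branching here as one plausible reading rather than the definitive process.

```python
def next_steps(reason_known, fixable_at_nms, noc_expected_to_solve):
    """Return the steps to take after locating the fault.

    Assumption: NOC handles the fault only if all three answers are yes;
    any "no" leads to escalation. Logging comes first in both cases.
    """
    steps = ["Complete any logging that is needed"]
    if reason_known and fixable_at_nms and noc_expected_to_solve:
        steps.append("Handle fault situation")
    else:
        steps.append("Escalate to another group/individual/Nokia")
    return steps

handled = next_steps(True, True, True)
escalated = next_steps(True, False, True)
```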

Solution:
