NetOp System Administrator Guide
SYSTEM ADMINISTRATOR GUIDE

1543-APR 901 0382 Uen B1

Copyright

© Ericsson (LMI) 2009. All rights reserved.

Disclaimer

No part of this document may be reproduced in any form without the written permission of the copyright owner. The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

1543-APR 901 0382 Uen B1 | 2009-10-16

Contents

1 About This Document
1.1 Purpose
1.2 Target Groups
1.3 Prerequisites
1.4 Typographic Conventions
2 Onlining NetOp Managed Component
3 Collecting NetOp EMS Diagnostic Information
3.1 Collecting NetOp EMS Systems Diagnostics using NetOpDiag.ksh
3.2 Generating a NetOp EMS System Health Report using NetOp.ksh
3.3 Generating a NetOp EMS Node Connection Status Report using NetOp.ksh
3.4 Node Connection Failure Messages
4 Managing NetOp Node Proxies
4.1 Displaying the Current NetOp EMS Hierarchy
4.2 Reconfiguring the Current Node Proxy Version
Glossary
Reference List

1 About This Document

This chapter contains the following parts:

- Purpose
- Target groups
- Prerequisites
- Typographic conventions

1.1 Purpose

This document describes troubleshooting information for operational problems encountered with NetOp Element Management System (EMS).

1.2 Target Groups

The intended target groups for this document are the following:

- Alarm Operator
- Radio Network Engineer
- System Administrator
- Installation Engineer
- Field Technician
- Business Manager
- Switching Engineer

1.3 Prerequisites

Readers of this document must have authority to perform the tasks listed. It is also assumed that the readers of this document are familiar with the following:

- OSS-RC
- Solaris
- Windows-based applications
- Sybase


1.4 Typographic Conventions

The typographic conventions for all Customer Product Information (CPI) in OSS-RC are found in OSS Library Typographic Conventions, Reference [2].

2 Onlining NetOp Managed Component

Note: Following installation, the NetOp Managed Component (MC) is offline. For information on onlining and offlining the NetOp MC, see Self Management User Guide, Reference [3].

NetOp MC is offline by default. This ensures that NetOp does not consume resources on deployments that do not manage Redback nodes.

Note: If the OSS server is restarted, the NetOp MC is offlined again and must be onlined again.


3 Collecting NetOp EMS Diagnostic Information

This section describes troubleshooting tools to be used in the event that operational problems are encountered with NetOp Element Management System (EMS). The two scripts provided are NetOp.ksh and NetOpDiag.ksh.

The NetOp.ksh script provides:

- A system health report, when run with the status keyword; see Section 3.2.
- A node connection report, when run with the diagnostics keyword; see Section 3.3. Table 1 provides further information on errors detected by the node connection report, and steps for resolving these errors.

Both the system health report and the node connection report are intended for use by customers when troubleshooting.

The NetOpDiag.ksh script collects system diagnostic data in a tar file. The diagnostic data collected is intended for use by Ericsson Technical Support and should be forwarded in the event of ongoing operational problems.

In order to connect to the NetOp database, both the NetOpDiag.ksh and NetOp.ksh scripts require users to provide valid values for the following arguments:

dbusername
    The user name for connecting to the Sybase database. The parameter dbusername is always netop.

dbpassword
    The password for connecting to the Sybase database. This document provides examples of how to run the NetOpDiag.ksh and NetOp.ksh scripts. In these examples, the parameter dbpassword is given the value passwd. This is for illustrative purposes only. The correct dbpassword value can be obtained from Telecom Security Services (TSS) using pwAdmin; see TSS, System Administrator Guide, Reference [4].

dbspec
    An argument in the form jdbc:sybase:Tds:masterdataservice:5025\?user=netop\&password=passwd
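The three database arguments can be held in shell variables so that the password is typed in one place only. The following is a minimal sketch and not part of the NetOp software: the function name build_dbspec and the variables DBPASSWORD and DBSPEC are hypothetical, and the sketch assumes that the backslashes in the unquoted examples of this guide only stop the shell from interpreting ? and &. This assumption has not been verified against the scripts.

```shell
# Hypothetical helper: assemble the dbspec argument from its parts.
# Host "masterdataservice" and port 5025 are taken from the examples in this guide.
build_dbspec() {
    # $1 = database user name, $2 = database password
    printf 'jdbc:sybase:Tds:masterdataservice:5025?user=%s&password=%s\n' "$1" "$2"
}

# Usage (not run here). Double quotes around "$DBSPEC" keep '?' and '&' literal.
# DBSPEC=$(build_dbspec netop "$DBPASSWORD")
# ./NetOp.ksh -u netop -p "$DBPASSWORD" -d "$DBSPEC" status
```

Keeping the password in a variable also avoids repeating it in two places on the command line, although it still appears in the process argument list while the script runs.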

3.1 Collecting NetOp EMS Systems Diagnostics using NetOpDiag.ksh

If operational problems that require troubleshooting are encountered, the NetOpDiag.ksh script can be run to take a snapshot of the entire system. The NetOpDiag.ksh script is run using the following syntax:


    ./NetOpDiag.ksh -u dbusername -p dbpassword -d dbspec archive-dir

This creates the following two output files:

- emsDiagyyyy-mm-dd_hh-mm-ss.tar.gz
- emsDiagLogyyyy-mm-dd_hh-mm-ss.log

The NetOpDiag.ksh script requires a value for the archive-dir argument, which defines the directory where the .tar file, the system diagnostics data, and the log data for the script are stored. This can be an absolute or relative path.

Note: This directory must have at least 1GB of free space.

To collect system diagnostics with the NetOpDiag.ksh script, perform the following steps:

1. Log on to the NetOp EMS server host as nmsadm.
2. Open a terminal window and navigate to the /opt/ericsson/nms_netop/NetOp/0.0.0.0 directory.
3. Run the NetOpDiag.ksh script according to the following syntax:

    ./NetOpDiag.ksh -u dbusername -p dbpassword -d dbspec archive-dir

The following information is collected and stored in the .tar file:

- NetOp EMS log files
- Node proxy log files
- NetOp EMS thread output
- Node proxy thread output
- Connection status and history tool output (requires the optional database arguments)
- NetOp EMS system health tool output:
    - NetOp EMS state output
    - Node proxy state output
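Because the archive directory needs at least 1GB of free space, it can be worth checking the space before running NetOpDiag.ksh. The following is a minimal sketch, assuming a df command that accepts the POSIX -P and -k options (on Solaris hosts the XPG4 variant of df may be needed). The function names avail_kb and has_min_free are hypothetical and are not part of the NetOp software.

```shell
# Print the available space, in KB, of the file system holding directory $1.
# Assumes a df that supports the POSIX -P (portable format) and -k options;
# in that format the fourth column of the second line is the available space.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 KB available.
has_min_free() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# Usage (not run here): 1GB is taken as 1048576 KB.
# has_min_free /tmp 1048576 || echo "less than 1GB free in /tmp" >&2
```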

The following is a sample request for NetOp EMS diagnostics information to be saved in /tmp:

    ./NetOpDiag.ksh -u netop -p passwd -d jdbc:sybase:Tds:masterdataservice:5025\?user=netop\&password=passwd /tmp

As the request is processing, the following information appears in the terminal window:

    Using directory /tmp
    This script will log data to /tmp/emsDiagLog2009-02-20_15-32-10.log.
    Verifying NFS server netop240-14
    Triggering thread dump for EMS server java processes.
    Note that this script requires the EMS server to have been started through the NetOp.ksh script
    EMS Server Thread Dump for PID(s): 27654
    Triggering thread dump for EMS Proxy java processes.
    Proxy Thread Dump for PID(s): 29920
    Proxy Thread Dump is between bytes 2678313 and 2746756 in proxy-SER-6.1.4.1-Proxy1.log
    Also attempted to capture this thread dump to proxy_Proxy1_threaddump.txt
    Proxy Thread Dump for PID(s): 29919
    Proxy Thread Dump is between bytes 91441 and 146854 in proxy-SER-6.1.4.1-Proxy2.log
    Also attempted to capture this thread dump to proxy_Proxy2_threaddump.txt
    Collecting System Health Status
    Collecting Connection History
    The EMS connection history data has been saved to /tmp/connHistory.log (this file is also included in the archive)
    Archiving EMS server directory at NetOpSrv/0.0.0.0/log
    Archiving Proxy directory /opt/ericsson/nms_netop/NetOpProxy/0.0.0.0_SER_6.1.4.1
    Archiving NetOp directory at NetOp/0.0.0.0/log
    The troubleshooting archive has been saved to /tmp/emsDiag2009-02-20_15-32-10.tar.gz
    The log file for this tool has been saved to /tmp/emsDiagLog2009-02-20_15-32-10.log (this file is also included in the archive)
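When the archive has to be forwarded to Ericsson Technical Support, the archive path can be picked out of the terminal output instead of being copied by hand. The following is a minimal sketch that relies only on the message text shown in the sample output above; it assumes the message and the path are printed on one line. The function name archive_path and the file name /tmp/diag.out are hypothetical.

```shell
# Read NetOpDiag.ksh terminal output on standard input and print the path
# that follows "The troubleshooting archive has been saved to".
# Assumes the message and the path appear on the same line.
archive_path() {
    sed -n 's/^.*troubleshooting archive has been saved to[[:space:]]*\([^[:space:]]*\).*$/\1/p'
}

# Usage (not run here): save the terminal output, then extract the path.
# ./NetOpDiag.ksh -u netop -p "$DBPASSWORD" -d "$DBSPEC" /tmp > /tmp/diag.out
# archive_path < /tmp/diag.out
```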

3.2 Generating a NetOp EMS System Health Report using NetOp.ksh

The NetOp EMS system health report provides data about the processes associated with the database and the proxy servers that run on a single NetOp EMS server. The report is written to standard output (stdout) and displayed in the terminal window. The report is also saved in a file with the name NetOpStatus_yyyy-mm-dd_hh-mm-ss.log in the directory /opt/ericsson/nms_netop/NetOp/0.0.0.0/log.

To generate a NetOp EMS system health report, perform the following steps:

1. Log on to the NetOp EMS server host as nmsadm.
2. Open a terminal window and navigate to the /opt/ericsson/nms_netop/NetOp/0.0.0.0 directory.
3. Run the NetOp.ksh script as follows:

    ./NetOp.ksh -u dbusername -p dbpassword -d dbspec status

The resulting NetOp EMS system health report includes the following data about each process:

- Process ID
- Process description
- Process status (running or not running)
- Version information

The following is a sample request for a system health report:

    ./NetOp.ksh -u netop -p passwd -d jdbc:sybase:Tds:masterdataservice:5025\?user=netop\&password=passwd status

The following is a sample system health report that appears in the terminal window:

    Output redirected to /opt/ericsson/nms_netop/NetOp/0.0.0.0/log/NetOpStatus_2009-02-20_15-30-47.log
    ---------------------------------------------
    Redback Networks NetOp EMS System Health Tool
    ---------------------------------------------
    Executed at Fri Feb 20 15:30:48 PST 2009
    System Info
    Name:................netop240-14
    Hardware:............sun4u
    OS:..................SunOS 5.10 Generic_127127-11
    Database Server
    URL:.................masterdataservice
    Location:............Local
    Connection status:...Successful
    Process Status:......NOT Running
    PID:.................none
    EMS Server: EmsServer
    NetOp Release:.......0.0.0.0
    Status:..............Running
    PID:.................27654
    Proxy: Proxy1
    NetOp Release:.......0.0.0.0
    Proxy Release:.......0.0.0.0_SER_6.1.4.1
    Status:..............Running
    PID:.................29920
    Proxy: Proxy2
    NetOp Release:.......0.0.0.0
    Proxy Release:.......0.0.0.0_SER_6.1.4.1
    Status:..............Running
    PID:.................29919
    [Warning] Unknown command line ids: EmsServer
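Since the health report is also saved as a log file, a saved report can be scanned for components that are not running. The following is a minimal sketch based only on the "Status:" lines visible in the sample report above; the function name count_not_running is hypothetical, and the report layout of other NetOp releases may differ.

```shell
# Print the number of lines in report file $1 whose status field reads
# "NOT Running". Matches both "Status:" and "Process Status:" lines,
# with any number of dot leaders between the label and the value.
count_not_running() {
    awk '/Status:\.*NOT Running/ { n++ } END { print n + 0 }' "$1"
}

# Usage (not run here), with the file name from the sample report:
# count_not_running /opt/ericsson/nms_netop/NetOp/0.0.0.0/log/NetOpStatus_2009-02-20_15-30-47.log
```

A result of 0 means that no status line in the report reads NOT Running; any other value is the number of components to look at.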

3.3 Generating a NetOp EMS Node Connection Status Report using NetOp.ksh

The NetOp EMS node connection status report provides connection and identification data for all managed nodes. The report is written to standard output (stdout) and displayed in the terminal window.

Note: Unlike the NetOp EMS system health report, this node connection status report is not saved as a log file. The output (stdout) can be redirected to a file using standard UNIX redirection commands. This is advisable when creating reports for a large number of nodes.

To generate a NetOp EMS node connection status report, perform the following steps:

1. Log on to the NetOp EMS server host as nmsadm.
2. Open a terminal window and navigate to the /opt/ericsson/nms_netop/NetOp/0.0.0.0 directory.
3. Run the NetOp.ksh script as follows:

    ./NetOp.ksh -u dbusername -p dbpassword -d dbspec diagnostics serverID

The resulting NetOp EMS node connection status report includes the following data:

- IP address
- Username
- Node ID, type, and version
- ALAPI node type and version
- Connection status and reason for failed status
- Channel information (ALAPI, event, and log)
- Historical connection status
- Management status
- Time stamps (managed, unmanaged, and connection)
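Because the node connection status report is not saved automatically, its standard output has to be redirected by the person running it. The following is a minimal sketch of that redirection wrapped in a small function; the function name save_stdout, the variables DBPASSWORD and DBSPEC, and the output file name are hypothetical.

```shell
# Run a command and write its standard output to file $1.
# Standard error is left attached to the terminal so that errors stay visible.
save_stdout() {
    out=$1
    shift
    "$@" > "$out"
}

# Usage (not run here; the report file name is only an example):
# save_stdout /tmp/node_conn_report.txt \
#     ./NetOp.ksh -u netop -p "$DBPASSWORD" -d "$DBSPEC" diagnostics EmsServer
```

The same effect is obtained by appending > file to the NetOp.ksh command line directly; the function only makes the redirection reusable.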

The following is a sample request for a node connection report:

    ./NetOp.ksh -u netop -p passwd -d jdbc:sybase:Tds:masterdataservice:5025\?user=netop\&password=passwd diagnostics EmsServer

The following is a sample node connection status report:

    Trying to obtain diagnostics from server EmsServer...
    15:24:40,520 INFO Jam:? - Directing output to : stdout
    2009-02-20 15:24:40,520 Jam.INFO : Directing output to : stdout
    15:24:40,528 INFO Jam:? - NetOp Software Version Number: 6.1.4.2.17
    2009-02-20 15:24:40,528 Jam.INFO : NetOp Software Version Number: 6.1.4.2.17
    Retrieving diagnostics from EmsServer
    Node Connection Diagnostics (Gathered from EmsServer at Fri Feb 20 15:25:10 PST 2009)
    ===================================================================
    Node foo2:
      Current Connection:
        Reported by: EmsServer/Proxy2
        Managed at: Wed Feb 18 15:15:49 PST 2009 (Node assigned to proxy)
        Current State: Connection Failure
        Connection Attempt Time: Fri Feb 20 15:24:51 PST 2009
        Connection Parameters:
          IP Address: /2.2.2.2
          Username: netop
          Node Type: SE800 Router/6.1.4.1
        Connection Failure:
          Reason: Cannot establish ALAPI channel: java.net.SocketTimeoutException: connect timed out
        Connection Initialization: Failed
        Initialization Progress
          Connecting to the ALAPI socket ... Failed
        Channels:
          ALAPI Channel: Closed
          ALAPI Channel:
            Failure: Failed to establish connection: java.net.SocketTimeoutException: connect timed out
          Event Channel: Connection not attempted
          Event Channel:
          Log Channel: Connection not attempted
          Log Channel:
      Connection History [1 of 1]:
        Reported by: EmsServer/Proxy2
        Managed at: Wed Feb 18 15:15:49 PST 2009 (Node assigned to proxy)
        Current State: Disconnected
        Connection Attempt Time: Wed Feb 18 15:15:50 PST 2009
        Connection Parameters:
          IP Address: /2.2.2.2
          Username: netop
          Node Type: SE800 Router/6.1.4.1
        Connection Failure:
          Reason: Cannot establish ALAPI channel: java.net.SocketTimeoutException: connect timed out
          Repeated failure count: 2470
          First failure occurred at: Wed Feb 18 15:17:00 PST 2009
          Last failure occurred at: Fri Feb 20 15:24:51 PST 2009
        Connection Initialization: Failed
        Initialization Progress
          Connecting to the ALAPI socket ... Failed
        Channels:
          ALAPI Channel: Closed
          ALAPI Channel:
            Failure: Failed to establish connection: java.net.SocketTimeoutException: connect timed out
          Event Channel: Connection not attempted
          Event Channel:
          Log Channel: Connection not attempted
          Log Channel:
    -------------------------------------------------------------------
    Node vlad:
      Current Connection: Not Connected
      Connection History [1 of 9]:
        Reported by: EmsServer/Proxy1
        Managed at: Wed Feb 18 15:15:46 PST 2009 (Node assigned to proxy)
        Unmanaged at: Fri Feb 20 15:18:30 PST 2009 (Node unassigned from proxy)
        Current State: Disconnected
        Connection Attempt Time: Wed Feb 18 15:15:47 PST 2009
        Connection Parameters:
          ALAPI Node Type: SE400 Router/6.1.4.2.18
          IP Address: /10.192.17.247
          Username: netop
          Node Type: SE400 Router/6.1.4.2.18
        Connection Failure:
          Reason: Connection broken: Broken pipe
        Connection Initialization: Completed
        Channels:
          ALAPI Channel: Closed
          ALAPI Channel: tls
            Failure: Dropped Connection: Broken pipe
          Event Channel: Closed
          Event Channel: tls
          Log Channel: Closed
          Log Channel: tls
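For a report covering many nodes, counting the distinct "Reason:" values gives a quick view of which failure types dominate. The following is a minimal sketch that assumes the report was redirected to a file and that each reason is printed on a single line beginning with "Reason: ", as in the sample above. The function name reason_counts is hypothetical.

```shell
# Read a saved node connection status report from file $1 and print each
# distinct failure reason, preceded by the number of times it appears
# and a tab. The order of the output lines is not defined.
reason_counts() {
    awk '
        /Reason: / {
            sub(/^.*Reason: /, "")   # keep only the text after "Reason: "
            count[$0]++
        }
        END {
            for (r in count) printf "%d\t%s\n", count[r], r
        }
    ' "$1"
}

# Usage (not run here; the report file name is only an example):
# reason_counts /tmp/node_conn_report.txt
```

Note that the current connection and each connection history entry of a node both carry a reason, so the counts are per report entry rather than per node.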

3.4 Node Connection Failure Messages

Table 1 lists the failure types that appear in the node connection status report, with the possible reasons and resolutions for each.

Table 1   Node Connection Failure Types, Possible Reasons, and Resolutions

Failure Type: Cannot establish ALAPI channel
Reasons and Resolution: The NetOp daemon (netopd) is not running on the node. This failure could be caused by a variety of network problems, including:
- The NetOp daemon stopped. Resolution: Restart the NetOp daemon with the process restart netopd command in exec mode.
- The firewall is blocking port 6565. Resolution: Open the port to establish an ALAPI channel.
- Another proxy is managing the node.
- The IP address of the node is incorrect.

Failure Type: Cannot establish event channel
Reasons and Resolution: Firewall problem (port 6566). Resolution: Open the port on the firewall.

Failure Type: Cannot establish log channel
Reasons and Resolution: Firewall problem (port 6567). Resolution: Open the port on the firewall.

Failure Type: Handshake failed
Reasons and Resolution: The following reasons are possible:
- Software compatibility. The SmartEdge OS version is incompatible with the node proxy versions that are running. Resolution: Start a node proxy server that is compatible with your version of the SmartEdge OS.
- Username or password mismatch. Resolution: Check that the username and password match what is expected.

Failure Type: Connection attempt stopped
Reasons and Resolution: The node connection attempts are unsuccessful. Resolution: Change the node status to unmanaged, then back to managed.

Failure Type: Connection broken
Reasons and Resolution: The following reasons are possible:
- Network routing problem. Resolution: Check network connectivity and resolve any issues in the network.
- New firewall rules were introduced. Resolution: Ensure that the firewall is configured correctly to allow open communication.
- Network congestion is causing packet loss. Resolution: Identify and fix any network issues.
- The NetOp daemon on the node stopped running. Resolution: On the affected node, enter the show process netopd command to verify that the NetOp daemon is running. If it is not, restart it using the process restart netopd command in exec mode.

Failure Type: Connection manager shutdown
Reasons and Resolution: The node proxy server is shutting down. Resolution: None required. This state is transitory.

Failure Type: Invalid or unexpected connection state
Reasons and Resolution: Unstable network problem. Resolution: None required. The NetOp software automatically disconnects any open connections and then tries to reestablish them.

Failure Type: Unknown connection failure
Reasons and Resolution: Unknown. Resolution: None required. The NetOp software automatically disconnects any open connections and then tries to reestablish them.
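The three "Cannot establish ... channel" rows of Table 1 each name one firewall port. When preparing a firewall change request, that mapping can be kept as a small lookup. The following is a minimal sketch; the function name channel_port and the lowercase channel keywords are hypothetical, and the port numbers are only those stated in Table 1.

```shell
# Print the port that Table 1 names for a given channel.
# $1 is one of: alapi, event, log. Returns non-zero for anything else.
channel_port() {
    case "$1" in
        alapi) echo 6565 ;;
        event) echo 6566 ;;
        log)   echo 6567 ;;
        *)     echo "unknown channel: $1" >&2; return 1 ;;
    esac
}

# Usage: list all three ports that must be open between proxy and node.
# for ch in alapi event log; do echo "$ch $(channel_port "$ch")"; done
```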

4 Managing NetOp Node Proxies

This section describes the configure_hierarchy.sh script, which is used when one or more node proxies used by the NetOp MC need to be reconfigured.

The NetOp MC communicates with SmartEdge routers using node proxy software that is specific to a release of the SmartEdge OS software. By default, two instances of the proxy for the SmartEdge OS, Release 6.1.4 are started when you online the NetOp MC. Each proxy instance is capable of supporting the maximum number of nodes managed by the NetOp MC. When two instances of the same proxy are running, the load is balanced between the two instances until one fails. If a proxy instance fails and then is restored, the load is not automatically rebalanced; to rebalance the load, you must offline and then online the NetOp MC.

The NetOp MC provides three node proxy versions to support the following SmartEdge OS releases:

- SmartEdge OS, Release 6.1.4 (the default)
- SmartEdge OS, Release 6.1.5
- SmartEdge OS, Release 6.2.1

Use the configure_hierarchy.sh script, located in the /opt/ericsson/nms_netop/NetOpSrv/0.0.0.0 directory, to:

- Identify the current status of the node proxies; see Section 4.1.
- Reconfigure the node proxies; see Section 4.2.

In order to connect to the NetOp database, the configure_hierarchy.sh script requires users to provide the value "jdbc:sybase:Tds:masterdataservice:5025\?user=netop\&password=redback", which identifies the Sybase database and the NetOp MC user.

4.1 Displaying the Current NetOp EMS Hierarchy

Displaying the current NetOp EMS hierarchy identifies the NetOp EMS components that are configured. The node proxies are the lowest items in the hierarchy and the only ones that can be reconfigured.

To identify the current status of node proxies with the configure_hierarchy.sh script:

1. Log on to the NetOp EMS server host as nmsadm.
2. Open a terminal window and navigate to the /opt/ericsson/nms_netop/NetOpSrv/0.0.0.0 directory.
3. Enter the following command:

    ./configure_hierarchy.sh "jdbc:sybase:Tds:masterdataservice:5025\?user=netop\&password=passwd"

   The following output appears:

    ------EMS Server Hierarchy Setup------
    Configure Hierarchy
    1) Display Hierarchy
    2) Configure Ems Group
    3) Configure Server Group
    4) Configure Ems Server
    5) Configure Proxy
    6) Quit
    Please enter a choice [1-6]:

4. Enter 1. Output similar to the following appears:

    --> EmsGroup
       |--> ServerGroup
          |--> EmsServer
             |--> Proxy2
             |--> Proxy1

4.2 Reconfiguring the Current Node Proxy Version

Use the Configure Proxy option of the configure_hierarchy.sh script to reconfigure the node proxy. Each node proxy is configured separately.

Two scenarios that require the node proxies to be reconfigured are possible:

- Two instances of the same node proxy version are required, but the required version is not the default one. All of the SmartEdge routers are using the same SmartEdge OS release, either Release 6.1.5 or 6.2.1.
- One instance of each of two node proxy versions is required, because some nodes are using one SmartEdge OS release, and some are using another.

To reconfigure the current node proxy using the configure_hierarchy.sh script:

1. Start the configure_hierarchy.sh script; see Section 4.1.
2. At the Please enter a choice [1-6]: prompt, enter 5. The following output appears:

    Current list of proxies in database:
    1) Proxy2 [ server group: ServerGroup host: masterservice port: 9609 max node: 125 version: 6.1.4.1 management status: MANAGED ]
    2) Proxy1 [ server group: ServerGroup host: masterservice port: 9608 max node: 125 version: 6.1.4.1 management status: MANAGED ]
    Enter an action [(a)dd, (e)dit, (r)emove, (c)ancel]:

3. Enter e.
4. At the following prompt, identify the proxy you wish to reconfigure:

    Which proxy do you wish to edit? (1-2) [(c)ancel]:

   Enter 1 or 2, as appropriate.
5. The script allows several values to be configured. The only value to reconfigure is the installed proxy version; press Enter at the other prompts to keep their current values.

   For example, to replace the default SmartEdge OS version used for Proxy2 with SmartEdge OS, Release 6.2.1.0:

    Enter the Unicast Port Number [current value 9609]:
    Enter the Max Node Number (max 256 nodes) [current value 125]:
    Installed Proxy Versions:
    1) 6.1.4.1
    2) 6.1.5.1
    3) 6.2.1.0
    Enter the Proxy version (1-3) [current value 6.1.4.1]: 3
    Please choose a proxy management status:
    1) Managed
    2) Unmanaged
    Please enter a choice [1-2] [current value MANAGED]:
    You have entered: 9609 125 6.2.1.0 Managed
    Are the above values correct? [y/n]: y
    Proxy Proxy2 [ host: masterservice port: 9609 ] updated.
    Current list of proxies in database:
    1) Proxy2 [ server group: ServerGroup host: masterservice port: 9609 max node: 125 version: 6.2.1.0 management status: MANAGED ]
    2) Proxy1 [ server group: ServerGroup host: masterservice port: 9608 max node: 125 version: 6.1.4.1 management status: MANAGED ]
    Enter an action [(a)dd, (e)dit, (r)emove, (c)ancel]:

6. Optionally, enter e at the prompt to repeat the previous step and change the proxy version for the second proxy instance. Otherwise, enter c and then 6 to exit the script.
7. Use the Self Management Tool to restart the NetOp MC; for example:

    smtool -coldrestart netop_ems -reason=planned -reasontext="proxy config"

   For information on onlining and offlining the NetOp MC, see Self Management User Guide, Reference [3].
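After reconfiguring, the proxy list printed by the script can be reduced to name and version pairs to confirm which version each proxy instance now uses. The following is a minimal sketch that works on saved or pasted terminal output; it assumes each proxy entry is printed on one line in the form shown in the sample output above. The function name proxy_versions is hypothetical.

```shell
# Read configure_hierarchy.sh terminal output on standard input and print
# "<proxy name> <version>" for every line of the form
#   N) Name [ ... version: X.Y.Z.W management status: ... ]
# Menu lines such as "1) 6.1.4.1" or "1) Managed" do not match and are skipped.
proxy_versions() {
    sed -n 's/^[[:space:]]*[0-9][0-9]*) \([^ ]*\) \[.* version: \([^ ]*\) .*$/\1 \2/p'
}

# Usage (not run here; the file name is only an example):
# proxy_versions < /tmp/configure_hierarchy.out
```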

Glossary

The OSS Glossary is included in Operations Support System (OSS) Glossary, Reference [1].

Reference List

[1] Operations Support System (OSS) Glossary, 0033-AOM 901 017/2
[2] OSS Library Typographic Conventions, 1/154 43-AOM 901 017/4
[3] Self Management User Guide, 1/1553-APR 901 951
[4] TSS, System Administrator Guide, 1543-APR 901 0003