introduction to online diagnostics for dell … · introduction to online diagnostics for dell...

4
www.dell.com/powersolutions Reprinted from Dell Power Solutions, February 2006. Copyright © 2006 Dell Inc. All rights reserved. DELL POWER SOLUTIONS 87 SYSTEMS MANAGEMENT D iagnostics that help identify hardware failures and changes in a system’s condition can be critical for administrators who must minimize server maintenance and maximize uptime and reliability. Deploying an effi- cient, effective diagnostics program can help reduce system downtime considerably . The Dell OpenManage Online Diagnostics pro- gram is an integral part of Dell OpenManage Server Administrator software. This program consists of a suite of diagnostic test modules that run locally on a Dell PowerEdge server and can be accessed remotely over a network. Diagnostic tests can be selected from a hierarchical menu representing the hardware that the Online Diagnostics program discovers on a Dell PowerEdge server. Tests can be run simultaneously or sequentially in a single session. In addition, adminis- trators can view progress and results for each selected test or hardware component. Figure 1 shows the Diag- nostic Selection screen in Dell OpenManage Server Administrator . Benefits of Dell OpenManage Online Diagnostics Administrators can install and use Dell OpenManage Online Diagnostics to diagnose performance issues with their hardware. For instance, a Dell-supplied serial modem might not be performing at the specified speed. An admin- istrator can use the device tree in Online Diagnostics and run diagnostics on the entire subset of devices that might be causing the modem issue. In this example, the administrator would run diagnos- tics on the modem and its parent, the serial port. If the modem is at fault, the modem diagnostic program would fail one or more of the commands written to it and flag an error to the administrator. However, if the problem is caused by an improper serial port setting (such as a low baud rate), then the serial port diagnostics would fail, indicating a meaningful error to the administrator. In this manner, the Online Diagnostics program can help admin- istrators and Dell tech support staff to isolate hardware issues and prevent false service dispatches, thus helping to reduce service costs. BY PRATHAP THATHIREDDY AND SRIKRISHNA SRIDHAR MURTHY Introduction to Online Diagnostics for Dell PowerEdge Servers Dell OpenManage Online Diagnostics is a comprehensive, cross-platform diagnos- tics program designed to enhance operation of Dell PowerEdge servers and help reduce service costs. This article introduces Dell OpenManage Online Diagnostics, describes its features, and discusses administration scenarios. Related Categories: Dell OpenManage Dell PowerEdge servers Diagnostics Systems management Troubleshooting Visit www.dell.com/powersolutions for the complete category index.

Upload: buikhanh

Post on 30-Aug-2018

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to Online Diagnostics for Dell … · Introduction to Online Diagnostics for Dell PowerEdge Servers ... A powerful systems management tool The Dell OpenManage On line

www.dell.com/powersolutions Reprinted from Dell Power Solutions, February 2006. Copyright © 2006 Dell Inc. All rights reserved. DELL POWER SOLUTIONS 87

SYSTEMS MANAGEMENT

Diagnostics that help identify hardware failures and

changes in a system’s condition can be critical for

administrators who must minimize server maintenance

and maximize uptime and reliability. Deploying an effi-

cient, effective diagnostics program can help reduce

system downtime considerably.

The Dell OpenManage Online Diagnostics pro-

gram is an integral part of Dell OpenManage Server

Administrator software. This program consists of a

suite of diagnostic test modules that run locally on a

Dell PowerEdge server and can be accessed remotely

over a network. Diagnostic tests can be selected from

a hierarchical menu representing the hardware that

the Online Diagnostics program discovers on a Dell

PowerEdge server. Tests can be run simultaneously or

sequentially in a single session. In addition, adminis-

trators can view progress and results for each selected

test or hardware component. Figure 1 shows the Diag-

nostic Selection screen in Dell OpenManage Server

Administrator.

Benefits of Dell OpenManage Online DiagnosticsAdministrators can install and use Dell OpenManage

Online Diagnostics to diagnose performance issues with

their hardware. For instance, a Dell-supplied serial modem

might not be performing at the specified speed. An admin-

istrator can use the device tree in Online Diagnostics and

run diagnostics on the entire subset of devices that might

be causing the modem issue.

In this example, the administrator would run diagnos-

tics on the modem and its parent, the serial port. If the

modem is at fault, the modem diagnostic program would

fail one or more of the commands written to it and flag

an error to the administrator. However, if the problem is

caused by an improper serial port setting (such as a low

baud rate), then the serial port diagnostics would fail,

indicating a meaningful error to the administrator. In this

manner, the Online Diagnostics program can help admin-

istrators and Dell tech support staff to isolate hardware

issues and prevent false service dispatches, thus helping

to reduce service costs.

BY PRATHAP THATHIREDDY AND SRIKRISHNA SRIDHAR MURTHY

Introduction to Online Diagnostics for Dell PowerEdge Servers

Dell™ OpenManage™ Online Diagnostics is a comprehensive, cross-platform diagnos-

tics program designed to enhance operation of Dell PowerEdge™ servers and help

reduce service costs. This article introduces Dell OpenManage Online Diagnostics,

describes its features, and discusses administration scenarios.

Related Categories:

Dell OpenManage

Dell PowerEdge servers

Diagnostics

Systems management

Troubleshooting

Visit www.dell.com/powersolutions

for the complete category index.

Page 2: Introduction to Online Diagnostics for Dell … · Introduction to Online Diagnostics for Dell PowerEdge Servers ... A powerful systems management tool The Dell OpenManage On line

SYSTEMS MANAGEMENT

DELL POWER SOLUTIONS Reprinted from Dell Power Solutions, February 2006. Copyright © 2006 Dell Inc. All rights reserved. February 200688

Devices supported by Dell OpenManageOnline DiagnosticsThe Dell OpenManage Online Diagnostics program provides

diagnostics for several Dell-supplied and Dell-certified hard-

ware devices.

CMOS RAM diagnostics. CMOS (complementary metal-oxide

semiconductor) memory contains system configuration information.

The CMOS diagnostic program performs a checksum test of the

CMOS memory to determine whether any bytes are corrupt.

CD/DVD diagnostics. The CD/DVD diagnostic test identifies

drive-related mechanical problems such as those affecting the drive

door, spindle motor, and fault-sector and read functions.

Serial port diagnostics. Serial port diagnostics identify any

issues, such as baud rate, related to serial port configurations. They

also cover communication issues such as internal loop back, inter-

rupt handling, and internal registers. These diagnostics can be used

to diagnose performance-related issues that result from improper

configuration of serial devices.

Parallel port diagnostics. This test diagnoses parallel port

configurations and communication-related issues. It can be used

to diagnose performance-related issues that result from improper

configuration of parallel devices.

Modem diagnostics. Modem diagnostics analyze communica-

tion registers on the modem. This also helps to diagnose any modem

hardware issues in sending and receiving commands.

Network interface controller diagnostics. This test is used

to diagnose any network communication and configuration-

related issues. These advanced diagnostics can be very helpful

to administrators in resolving issues with network configuration

on Dell servers.

Memory diagnostics. Memory diagnostics are designed to

test the system memory’s storage integrity and its ability to store

data accurately. This test verifies that data paths, error-correction

circuits, and memory devices are working correctly.

Dell Remote Access Controller (DRAC) diagnostics. The DRAC

diagnostic test provides IT administrators with continuous access

to remote Dell servers. These diagnostics analyze DRAC hardware

and communication issues.

USB controller diagnostics. USB controller diagnostics are

designed to identify hardware and communication-related issues

of USB controllers and any attached devices.

Floppy disk diagnostics. Floppy disk diagnostics detect prob-

lems with floppy disk controllers and their related components, such

as the motor, read/write mechanism, and floppy disk sectors.

PCI diagnostics. PCI diagnostics detect any driver-related

errors or interrupt request (IRQ) sharing warnings for PCI devices

in the system.

RAID controller diagnostics. Dell PowerEdge RAID controller

diagnostics report problems with RAID controller hardware, batter-

ies, and attached disks.

SCSI controller diagnostics. SCSI controller diagnostics detect

problems with SCSI controller hardware and attached devices such

as SCSI hard drives, tape drives, and autoloaders.

Features of Dell OpenManage Online DiagnosticsThe Dell OpenManage Online Diagnostics program provides IT

administrators with various features to perform diagnostic tests.

CLI- and GUI-based testsDiagnostic tests can be performed using OS-supported command-

line interfaces (CLIs) locally or remotely over Telnet or Secure Shell

(SSH). This feature can be helpful to administrators when they are

scripting diagnostics using well-known scripting tools. Graphical

user interface (GUI)–based diagnostics can be performed via HTTP

over Secure Sockets Layer (HTTPS) ports using well-known brows-

ers such as Microsoft Internet Explorer or Mozilla Firefox.

Device enumeration and test/device inventoryThis feature enables system administrators to re-inventory all sup-

ported devices after hardware reconfiguration. This feature can

prove helpful after adding components such as plug-and-play

devices, installing drivers for supported hardware components,

or implementing hot-swappable devices. The test/device selection

feature allows administrators to select desired tests and specify the

devices on which to run them.

Diagnostic schedulerThe diagnostic scheduler feature can help increase system uptime

by allowing administrators to select diagnostic tests to run at specific

times and frequencies. This can help administrators schedule server

maintenance and run appropriate diagnostics without affecting

Figure 1. Dell OpenManage Server Administrator Diagnostic Selection screen

Page 3: Introduction to Online Diagnostics for Dell … · Introduction to Online Diagnostics for Dell PowerEdge Servers ... A powerful systems management tool The Dell OpenManage On line

SYSTEMS MANAGEMENT

www.dell.com/powersolutions Reprinted from Dell Power Solutions, February 2006. Copyright © 2006 Dell Inc. All rights reserved. DELL POWER SOLUTIONS 89

business deliverables. Figure 2 displays the Dell OpenManage

Server Administrator Diagnostic Scheduling screen.

Diagnostic test review, test status, and result historyThe diagnostic test review option lets administrators review the

selected diagnostic tests before submission for execution. This

means that administrators can change settings that are predefined.

Other options include choosing test-specific settings such as halt-

on-error, specifying the number of iterations a test should run, or

even scheduling a test to run at a later time. Figure 3 displays the

Diagnostic Selection Review screen.

The diagnostic test status feature enables administrators to

monitor the status of the diagnostic tests that are running, allowing

them to view the progress of each test under execution. Figure 4

displays the Diagnostic Status screen.

The diagnostic result history feature allows administrators to

view the result history log. This log file contains the results of

previously run diagnostics tests. This log can help administrators

monitor events such as warnings, device failures, and the work-

ing condition of the device under test. The log file size can be

configured to a maximum of 5 MB. Figure 5 shows the Diagnostic

Result History screen.

Hardware configuration changes and change historyThe hardware configuration changes feature gives administrators

the option to view changes that have occurred to testable devices

on the system since the last reboot, restart of the secure port server,

or re-enumeration. It also reports changes in system configuration

such as the addition or removal of a hard drive. Figure 6 displays

the Diagnostic Hardware Configuration Changes screen.

Figure 2. Dell OpenManage Server Administrator Diagnostic Scheduling screen

Figure 3. Dell OpenManage Server Administrator Diagnostic Selection Review screen

Figure 4. Dell OpenManage Server Administrator Diagnostic Status screen

Figure 5. Dell OpenManage Server Administrator Diagnostic Result History screen

Page 4: Introduction to Online Diagnostics for Dell … · Introduction to Online Diagnostics for Dell PowerEdge Servers ... A powerful systems management tool The Dell OpenManage On line

SYSTEMS MANAGEMENT

DELL POWER SOLUTIONS Reprinted from Dell Power Solutions, February 2006. Copyright © 2006 Dell Inc. All rights reserved. February 200690

The change history feature enables administrators to view a

log file that contains a history of hardware configuration changes.

The log file size can be configured to a maximum size of 5 MB.

Figure 7 shows the Diagnostic Hardware Configuration Change

History screen.

Configuration optionsThe Dell OpenManage Online Diagnostics program can specify

options for running diagnostic tests. Both application settings and

test-execution settings can be specified. Figure 8 shows the Diag-

nostic Application Settings screen.

Options available under application settings include remote

method invocation (RMI) registry port, result history log size, and

hardware change history log size. Options available under test-

execution settings include halt execution on first error, quick test,

the number of passes, and runtime.

A powerful systems management toolThe Dell OpenManage Online Diagnostics program is an integrated

component within Dell OpenManage Server Administrator enter-

prise server management software. It provides a rapid method by

which administrators can diagnose hardware malfunctions and

identify solutions to such problems. IT administrators can use the

Online Diagnostics program to help reduce management costs and

enhance server management.

Prathap Thathireddy is a senior engineering analyst in the Product Group Test Engineering Group at Dell. He has a B.S. in Computer Maintenance and Engineering from Osmania University in Hyderabad, India. He has seven years of IT experience as a system administrator, technology consultant for storage software, and test engineer.

Srikrishna Sridhar Murthy is an engineering analyst in the Online Diagnostics Group at Dell. He has a B.E. in Computer Science from Birla Institute of Technology and Science in Pilani, India.

FOR MORE INFORMATION

Dell OpenManage:www.dell.com/openmanage

Figure 6. Dell OpenManage Server Administrator Diagnostic Hardware ConfigurationChanges screen

Figure 7. Dell OpenManage Server Administrator Diagnostic Hardware ConfigurationChange History screen

Figure 8. Dell OpenManage Server Administrator Diagnostic Application Settingsscreen