

    Troubleshooting Guide for BIOS POST on Dell 13th Generation of PowerEdge Servers

    Table of contents

    Revisions
    Executive summary
    1. BIOS Splash Screen Display
    2. POST Error and Warning Messages
    3. Post Code in iDRAC Web GUI
    4. Driver Health Status Report
    5. Dell Diagnostics (ePSA)
    6. Red Screen of Death (RSOD)
    7. Yellow Screen of Death (YSOD)


    Executive summary

    The Unified Extensible Firmware Interface (UEFI) is a set of industry-standard firmware interfaces designed to replace the legacy BIOS and to support modern operating systems and hardware architectures. Dell has been shipping UEFI support in the BIOS since the 11th generation of PowerEdge servers through a UEFI-over-Legacy model, in which the legacy BIOS initializes the whole system and loads the UEFI layer at the end of Power-On Self-Test (POST) if needed. The Dell Lifecycle Controller technology is built upon UEFI as well.

    The BIOS on the 13th generation of Dell PowerEdge servers is now a native UEFI implementation, with a Compatibility Support Module (CSM) that provides legacy BIOS interfaces for operating systems that are not UEFI-aware. The look and feel of the boot process is dramatically different from that of previous generations.

    This guide provides troubleshooting solutions for issues that may arise during POST and in the pre-boot environment on the 13th generation of PowerEdge servers.

    1. BIOS Splash Screen Display

    After the system is powered on, the Dell server BIOS may get to video display almost instantly. Fig. 1 is a sample snapshot of the POST splash screen. The text next to the progress bar at the bottom of the screen indicates the current phase of POST. This text can aid in troubleshooting issues that occur during the system boot process.

    The following table lists the progress texts currently supported in the BIOS:

    Text Display
        Phase of the Boot Process

    Initializing Intel QuickPath Interconnect...
        BIOS performs an early initialization of the chipset, processors, and QPI interfaces.
    Configuring Memory
        BIOS initializes the system memory.
    Loading BIOS Drivers
        BIOS starts the Driver Execution Environment (DXE) phase, and loads and executes DXE drivers to perform additional chipset, processor, and hardware initialization.
    Initializing iDRAC
        BIOS waits for iDRAC to become ready. This phase may take more than a few seconds on the first AC power-on of the system.
    Initializing iDRAC Done
        iDRAC initialization has completed.
    Initializing PCIe, USB and Video
        Start of PCI enumeration and detection of USB keyboard devices.
    Initializing PCIe, USB and Video Done
        PCI and USB enumeration has completed.
    Legacy PCI option ROM initialization (BIOS boot mode only)
        Applies to the BIOS boot mode only. The onscreen display varies, depending on the type of PCIe cards that are installed in the system.
    Testing Memory (X% Complete)
        Software-based memory test phase. A percent progress is displayed. Note: The memory test is disabled in the BIOS setup by default.
    Testing Memory Done [No Errors]
        Memory test completed without any issue.
    Testing Memory Done [Errors Encountered]
        Memory test has found error(s).
    Testing Memory Aborted
        Memory test was aborted by a key press (for example, the spacebar).
    Loading Lifecycle Controller Drivers
        BIOS loads the Lifecycle Controller drivers.
    Loading Lifecycle Controller Drivers Done
        BIOS has finished loading the Lifecycle Controller drivers.
    Initializing Firmware Interfaces
        BIOS connects the UEFI drivers to the device handles. The UEFI drivers from add-in PCIe cards are expected to be installed in this phase.
    Running In-System Characterization...
        In-System Characterization (ISC) is in progress.
    Connecting iSCSI device(s)
        The UEFI iSCSI device drivers are connected. This display applies to UEFI boot mode only. It is displayed when an iSCSI boot device has been configured.
    Enumerating Boot options
        BIOS starts to enumerate Boot Options in the system.
    Enumerating Boot options Done
        The enumeration of Boot Options has completed.
    Entering Lifecycle Controller
        The system is booting into the Lifecycle Controller.
    Lifecycle Controller: Applying Updates or Setting System Configuration
        An Automated Task Application is being scheduled in the Lifecycle Controller.
    Lifecycle Controller: Collecting System Inventory
        Lifecycle Controller is collecting system inventory for this boot.
    Lifecycle Controller: Done
        Lifecycle Controller has finished execution.
    Booting
        BIOS has finished POST and is giving control to the operating system.
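When a system hangs during POST, the last progress text on the screen narrows down the phase that was running. The following is a minimal sketch of that lookup in Python, assuming the progress text has been copied by hand from the console. The one-line summaries are abridged from the table above, and the function name `describe_phase` is hypothetical, not part of any Dell tool.

```python
# Progress-text prefixes and phase summaries, abridged from the table above.
POST_PHASES = [
    ("Initializing Intel QuickPath Interconnect", "Early chipset, processor, and QPI initialization"),
    ("Configuring Memory", "System memory initialization"),
    ("Loading BIOS Drivers", "DXE phase: loading and executing DXE drivers"),
    ("Initializing iDRAC", "Waiting for iDRAC to become ready"),
    ("Initializing PCIe, USB and Video", "PCI enumeration and USB keyboard detection"),
    ("Testing Memory", "Software-based memory test"),
    ("Loading Lifecycle Controller Drivers", "Loading the Lifecycle Controller drivers"),
    ("Initializing Firmware Interfaces", "Connecting UEFI drivers to device handles"),
    ("Running In-System Characterization", "In-System Characterization (ISC)"),
    ("Connecting iSCSI device", "Connecting UEFI iSCSI device drivers"),
    ("Enumerating Boot options", "Enumerating boot options"),
    ("Entering Lifecycle Controller", "Booting into the Lifecycle Controller"),
    ("Lifecycle Controller:", "Lifecycle Controller task"),
    ("Booting", "POST finished; handing control to the operating system"),
]


def describe_phase(progress_text):
    """Return (phase summary, completed flag) for a progress text, or None.

    The completed flag is True when the text carries the "Done" marker.
    """
    text = progress_text.strip().lower()
    for prefix, summary in POST_PHASES:
        if text.startswith(prefix.lower()):
            return summary, " done" in text
    return None


print(describe_phase("Initializing Firmware Interfaces"))
```

A text without the "Done" marker suggests that the hang occurred inside that phase, while a text with it suggests that the following phase had not yet updated the display.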

    2. POST Error and Warning Messages

    Fig. 2 An error message box in early POST

    If the issue is detected at a later time in POST, the corresponding error and warning messages are displayed on the screen with a UEFIxxxx prefix. An event entry is logged in the Lifecycle Controller log (LC log) as well. Depending on the severity of the error or warning, the system may continue booting, prompt for user input with F1/F2/F10/F11, reset, or halt. Each message comprises two parts: the error or warning message itself, and a recommended response action that you can follow to address the issue. For a complete list of POST error and warning messages, see the Event and Error Message Reference Guide for 13th Generation Dell PowerEdge Servers.

    http://en.community.dell.com/techcenter/systems-management/w/wiki/lifecycle-controller#attributereg

    In the following example, the UEFI driver for the Integrated Network card is not signed, and the user has just turned on Secure Boot in the BIOS setup utility. On the next boot, a few error messages are displayed on the screen during POST.

    - The first error message (UEFI0072) reports that the UEFI driver from the Integrated NIC 1 Port 1 Partition 1 was not loaded because it failed the Secure Boot authentication. You may address this issue by updating the NIC firmware to a version that supports UEFI driver signing.
    - The second error message (UEFI0071) reports that the previously configured UEFI network boot interface is no longer available. This is a result of the corresponding UEFI driver not being loaded.
    - The third message, a warning (UEFI0074), reports that the Secure Boot policy has been modified since the last time the system was booted. In this particular example, the user enabled Secure Boot on purpose, so no action needs to be taken.

    Fig. 3 An example of POST error messages

    Corresponding logs for the error and warning messages will be recorded in the Lifecycle Log (Fig. 4).
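When reviewing captured console text or an exported log, it can help to pull out the message IDs first and then look each one up in the reference guide. The sketch below is a minimal Python example under two assumptions: that the IDs take the form "UEFI" followed by four digits, as in the three examples above, and that the captured text is available as a plain string. The descriptions cover only the three messages discussed in this section; other IDs are deliberately not guessed.

```python
import re

# Descriptions paraphrased from the Secure Boot example in this section.
KNOWN_MESSAGES = {
    "UEFI0071": "A previously configured UEFI network boot interface is no longer available",
    "UEFI0072": "A UEFI driver was not loaded because it failed Secure Boot authentication",
    "UEFI0074": "The Secure Boot policy has been modified since the last boot",
}

# Assumption: message IDs are "UEFI" followed by exactly four decimal digits.
MESSAGE_ID = re.compile(r"\bUEFI\d{4}\b")


def find_message_ids(text):
    """Return the distinct UEFIxxxx IDs in text, in order of first appearance."""
    seen = []
    for match in MESSAGE_ID.finditer(text):
        if match.group() not in seen:
            seen.append(match.group())
    return seen


# Hypothetical captured text, for illustration only.
sample = "UEFI0072: driver not loaded\nUEFI0071: boot interface unavailable\nUEFI0074: policy changed"
for message_id in find_message_ids(sample):
    print(message_id, "-", KNOWN_MESSAGES.get(message_id, "see the reference guide"))
```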


    Fig. 4 Screen shot of the Lifecycle Log

    3. Post Code in iDRAC Web GUI

    If you cannot get to the screen display, the Post Code feature available in the iDRAC web GUI may come in handy. This page displays the last system POST code with a descriptive text. The POST code helps to detect pre-video hangs, report fatal errors, and analyze system failures during POST.


    Fig. 5 An example of the Post Code in the iDRAC Web GUI

    4. Driver Health Status Report

    The UEFI specification defines a Driver Health Protocol (DHP). The DHP provides services that allow a UEFI driver to express the health status of a controller, return status messages associated with the health status, perform repair operations if necessary, and request configuration changes to place the controller back in a usable state.

    The Dell server BIOS checks the driver health status of each UEFI driver in the system and displays the status messages. The BIOS may invoke the repair and configuration utility if a repair or reconfiguration operation is required. In most cases, you can follow the instructions on the screen to proceed.

    Fig. 6 is an example display where the BIOS halts on some errors returned from DHP. In this particular example, the iDRAC DHP detected that the backplane 2 power cable has been disconnected, and the LSI SAS controller requires configuration changes, possibly due to a catastrophic issue.
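To make the health states concrete, the following Python sketch models the status values that the UEFI specification defines for the Driver Health Protocol (the `EFI_DRIVER_HEALTH_STATUS` values) and pairs each non-healthy state with a follow-up step. The follow-up mapping and the function name are illustrative assumptions, not the policy implemented by the Dell BIOS.

```python
from enum import Enum


class DriverHealth(Enum):
    # Member names follow the EFI_DRIVER_HEALTH_STATUS values in the UEFI specification.
    HEALTHY = "healthy"
    REPAIR_REQUIRED = "repair required"
    CONFIGURATION_REQUIRED = "configuration required"
    FAILED = "failed"
    RECONNECT_REQUIRED = "reconnect required"
    REBOOT_REQUIRED = "reboot required"


# Hypothetical follow-up steps for illustration; not the Dell BIOS policy.
NEXT_STEP = {
    DriverHealth.REPAIR_REQUIRED: "run the driver's repair operation",
    DriverHealth.CONFIGURATION_REQUIRED: "open the configuration utility",
    DriverHealth.FAILED: "inspect the hardware and the status messages",
    DriverHealth.RECONNECT_REQUIRED: "reconnect the controller",
    DriverHealth.REBOOT_REQUIRED: "reboot the system",
}


def controllers_needing_attention(report):
    """Given {controller name: DriverHealth}, list (name, next step) for non-healthy ones."""
    return [
        (name, NEXT_STEP[status])
        for name, status in report.items()
        if status is not DriverHealth.HEALTHY
    ]
```

Feeding this function a report in which one controller has the status `CONFIGURATION_REQUIRED` returns that controller with the "open the configuration utility" step, which mirrors the reconfiguration case described above.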


    Fig. 6 Example of errors detected by UEFI Driver Health Protocol

    Fig. 7 is a snapshot of the Driver Health Manager in the case where a driver requires a configuration change. The Driver Health Manager lists all the device instances that require reconfiguration. You can select each of them and follow the instructions on the screen to configure the devices.


    Fig. 7 Driver Health Manager

    5. Dell Diagnostics (ePSA)

    Dell Enhanced Pre-Boot System Diagnostics (ePSA) are diagnostic tests that are embedded in the system (Fig. 8). These tests allow you to check the hardware health status outside the operating system environment. The findings of these diagnostics can assist you in troubleshooting the fault and working toward a resolution of the issue.

    The ePSA can be launched from Boot Manager -> System Utilities -> Launch Diagnostics (Fig. 9).


    Fig. 8 Sample screen shot of ePSA


    Fig. 9 Launching diagnostics from Boot Manager

    6. Red Screen of Death (RSOD)

    The Dell server BIOS implements an enhanced CPU exception handler (RSOD), which helps the user and technical support analyze the software exception when the system crashes in the pre-boot UEFI environment. The debug information is displayed on the screen, and additional information and stack traces can be retrieved through the serial port (if available). You can save the dump and use it for offline debugging.


    A sample RSOD display is depicted in Fig. 10.

    Fig. 10 An example of the RSOD screen shot

    When the processor raises an exception, the BIOS displays the RSOD screen with the following information related to the exception:

    - The exception type, such as Page Fault, General Protection Fault, Divide by Zero, Breakpoint, and so on.
    - A Dell-defined error value, prefixed with UEFIxxxx. Note that a corresponding error is logged to the LC log as well.
    - Partial register set (x86 64-bit).
    - Last-Branch records and associated module names, if available.
    - Current RIP and faulting driver module name.
    - Stack trace back from the faulted module.
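The exception types named in the list above correspond to fixed x86 exception vector numbers defined by the processor architecture. As a reading aid, here is a small Python lookup for a few common vectors. It assumes that you have a vector number to hand; how the RSOD screen itself presents the exception type is not specified here, and the function name is hypothetical.

```python
# A subset of the architecturally defined x86 exception vectors.
X86_EXCEPTIONS = {
    0: "Divide Error (#DE)",
    1: "Debug (#DB)",
    2: "Non-Maskable Interrupt (NMI)",
    3: "Breakpoint (#BP)",
    6: "Invalid Opcode (#UD)",
    8: "Double Fault (#DF)",
    13: "General Protection Fault (#GP)",
    14: "Page Fault (#PF)",
    18: "Machine Check (#MC)",
}


def exception_name(vector):
    """Return a readable name for an x86 exception vector number."""
    return X86_EXCEPTIONS.get(vector, f"Vector {vector} (not listed in this sketch)")


print(exception_name(14))
```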

    Additional information is available from the serial port dump. To retrieve the serial dump, connect the server to a client system with a null modem cable, open any terminal program (for example, PuTTY or HyperTerminal) with the baud rate set to 115200 bps, and then press the required key. The serial dump can also be retrieved through Serial over LAN (SOL).


    Note: The RSOD serial dump can be obtained at the point of failure. The serial session does not have to

    be started prior to the RSOD.
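If you prefer to script the capture rather than use a terminal program, the following Python sketch reads a dump from any binary stream until the stream goes quiet. It is a sketch under stated assumptions: `read_serial_dump` is a hypothetical helper, and the in-memory stream stands in for a real port so that the example runs anywhere. With the third-party pySerial package, a port opened at 115200 bps with a read timeout could be passed in its place, since its `read()` returns an empty result when the timeout expires.

```python
import io


def read_serial_dump(stream, max_bytes=1_000_000, chunk_size=4096):
    """Read from a binary stream until it returns no data or max_bytes is reached."""
    data = bytearray()
    while len(data) < max_bytes:
        chunk = stream.read(min(chunk_size, max_bytes - len(data)))
        if not chunk:  # end of stream, or a read timeout on a serial port
            break
        data.extend(chunk)
    return bytes(data)


# Stand-in for a real serial port; the content is hypothetical.
fake_port = io.BytesIO(b"hypothetical dump line 1\r\nhypothetical dump line 2\r\n")
dump = read_serial_dump(fake_port)
text = dump.decode("ascii", errors="replace")
print(text)
```

Saving the returned bytes to a file unchanged keeps the dump intact for offline analysis; decoding is only needed for display.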

    RSODs are usually caused by software issues, and may be resolved by updating the BIOS, the Lifecycle Controller, or the UEFI firmware for PCIe cards. If you still encounter an RSOD after all the firmware updates, you may send the screen shot and serial dump to Dell support for further analysis.

    7. Yellow Screen of Death (YSOD)

    When a hardware error occurs in the UEFI pre-boot environment (excluding the CSM phase in BIOS boot mode), the Dell server BIOS may display a Yellow Screen of Death (YSOD) with some of the software context at the time the issue was detected.

    The hardware errors include Non-Maskable Interrupts (NMI) and Machine Check Errors (MCE). You should check the System Event Log (SEL) to identify the source and type of the error. Update the corresponding device firmware if the error originated from a PCIe device.

    Note: The stack trace displayed on the YSOD screen only provides some context information before the

    failure, and not the source of the problem.

    A sample YSOD is depicted in Fig. 11.


    Fig. 11 An example of the YSOD screen shot