welcome technical services virtual boot camp session 8


Upload: kalia-neal

Post on 04-Jan-2016



© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

Welcome

Technical Services Virtual Boot Camp

Session 8

Technical Services India Team


Recap – Session 7 (18th Feb)

Process

Technology

Cisco Support Community

Technology
· Architecture Overview: UCS C-series, UCS B-series
· UCS Interoperability: Hardware, Software
· Troubleshooting: Case Study (Lab Demo)

Q&A


Course Material

PPT: https://supportforums.cisco.com/docs/DOC-37994

Video: https://supportforums.cisco.com/videos/7517

Q&A: https://supportforums.cisco.com/docs/DOC-37851


Today's Agenda (Session 8)

Technology
· Firmware install and upgrade: UCS C-series, UCS B-series
· Troubleshooting: Case Study (Lab Demo), important logs, part identification and RMA

Q&A


Introduction

Nirmal Sodani Technical Support Manager

Mohit Mmangal Manager, CSC

Avinash Shukla TAC Escalation Engineer

Vinay Sharma Lead, CSC

Teclus D'Souza TAC Escalation Engineer

Chetan Badami Technical Escalation Engineer


Technology – UCS

Avinash Shukla

Teclus D'Souza

Chetan Badami


© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public. BRKCOM-3001

Agenda

UCS Upgrade Procedure
– C-series
– B-series

UCS Troubleshooting
– UCSM / FI / IOM / Blade
– C-series

UCS H/W and S/W Interoperability

© 2010 Cisco Systems, Inc. All rights reserved. CAE BootcampPresentation_ID 8

UCS H/W and S/W Interoperability

Avinash Shukla, Cisco TAC

Operating System

Check the support matrix before installing the OS on the blade

Install the Ethernet and FC drivers listed in the matrix and keep them updated

A few important things to check:
– Is the blade running a certified OS and OS version?
– Are there any special needs for that OS? E.g. VMware: OEM image
– Are the drivers at the OS level updated and current?

Answer:
– UCS S/W and H/W matrix
– http://www.cisco.com/web/techdoc/ucs/interoperability/matrix/matrix.html
– http://www.cisco.com/en/US/products/ps10477/prod_technical_reference_list.html
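The three checks above can be sketched as a small lookup against a local copy of the matrix. This is only an illustration: the `MatrixEntry` type, the `check_blade` helper, and every OS name and driver version in `SUPPORT_MATRIX` are hypothetical placeholders, and real values must be taken from the UCS interoperability matrix itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixEntry:
    """One row of a locally maintained copy of the support matrix (hypothetical)."""
    os_name: str
    os_version: str
    eth_driver: str
    fc_driver: str


# Placeholder rows for illustration only; not real matrix data.
SUPPORT_MATRIX = [
    MatrixEntry("ExampleOS", "5.1", "1.0.0", "1.0.0"),
    MatrixEntry("ExampleOS", "5.5", "1.2.0", "1.1.0"),
]


def check_blade(os_name, os_version, eth_driver, fc_driver, matrix=SUPPORT_MATRIX):
    """Return a list of findings; an empty list means the blade matches a matrix row."""
    rows = [m for m in matrix if m.os_name == os_name and m.os_version == os_version]
    if not rows:
        return [f"{os_name} {os_version} is not listed in the matrix"]
    row = rows[0]
    findings = []
    if eth_driver != row.eth_driver:
        findings.append(f"Ethernet driver {eth_driver} differs from matrix value {row.eth_driver}")
    if fc_driver != row.fc_driver:
        findings.append(f"FC driver {fc_driver} differs from matrix value {row.fc_driver}")
    return findings
```

An empty result means the OS version and both drivers match a row; anything else is a prompt to consult the matrix before installing or opening a case.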

H/W and S/W Interop

What each matrix provides

Sample driver versions

UCS Upgrades

Agenda

C series firmware upgrade
– Pre-requisites
– Firmware ISO location and downloading
– Upgrade process

B series firmware upgrade
– Pre-requisites
– Firmware bundles and downloading
– Upgrade process
– Additions / Modifications from version 2.1

Pre-requisites: C Series

Things to consider

Release Notes will cover gotchas and concerns in the upgrade process

Upgrades from one version back will always work

Check release notes about prior versions
– If the customer is really far behind, it might require two upgrades to get to current code

Schedule a maintenance window
– CIMC and server will reboot during the upgrade
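The "two upgrades" case can be pictured as a path search over the direct upgrades that the release notes allow. The sketch below is only an illustration under stated assumptions: the release labels `r1` to `r4` and the `DIRECT_UPGRADES` map are hypothetical, and the real supported paths come from the release notes for each version.

```python
from collections import deque

# Hypothetical map: release -> releases it can be upgraded to directly.
# Real data must be taken from the release notes.
DIRECT_UPGRADES = {
    "r1": ["r2", "r3"],
    "r2": ["r3"],
    "r3": ["r4"],
}


def upgrade_path(current, target, direct=DIRECT_UPGRADES):
    """Return the shortest list of releases to install, or None if no path is known."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path[1:]  # drop the release that is already installed
        for nxt in direct.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

With this sample map, a server one version back needs a single hop, while a server on `r1` needs an intermediate release before it can reach `r4`.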

Firmware ISO

C Series Upgrade: Downloading the ISO file

Upgrade process: C Series

Map the ISO on the KVM

Boot from Virtual Media

HUU Screen and options

After all component upgrade

Verify Upgrade: to verify, check that all components are upgraded

Pre-requisites: B Series

Things to consider

Release Notes will cover gotchas and concerns in the upgrade process

Upgrades from one version back will always work

Check release notes about prior versions
– If the customer is running a very old version, it might require two upgrades to get to current code

Schedule a maintenance window
– FI and IOM will reboot during the upgrade
– Make sure the network and storage fabrics are redundant

Highly recommended to back up the UCSM configuration

Be patient

The upgrade process is not quick

Bugs sometimes show up in the first release after FCS

Expect a maintenance release shortly after FCS

Follow the upgrade procedure for each version
– The procedure is not always the same from one version to another

Downgrading

Sometimes there might be data loss

Might have to erase the configuration to downgrade
– Database changes in new versions cannot always be backported

Bundles

Prior to 1.4 there was only one all-inclusive bundle

Now there are multiple bundles
– Infra bundle: contains code for FI, IOM, and UCSM
– B-series bundle: contains BIOS and blade-specific code
– C-series bundle: contains BIOS and rack-server-specific code

All firmware work is done from the Equipment tab in UCSM

Packages can be viewed or deleted from the “Packages” tab

Bundles are downloaded from the “Download Tasks” tab

Downloads can be through the desktop or using ftp/scp/sftp/tftp

Cisco.com to download FCS bundles
– B-Series packages
– C-Series packages
– FI, IOM, and UCSM software

Pre-1.4 bundles (1.0-1.3) are a single download

Upgrade process: B Series

Upgrade Process

Again, always consult the release notes

Upgrade through the GUI is easiest

General process is:
• Back up the UCS config (Full State and All Configuration)
• Download code
• Update components
• Activate components in this order (check the release notes, because the order can change):
  • Interface cards – Set Startup Only
  • CIMC
  • IOM – Set Startup Only
  • UCSM
  • FI
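The activation order above can be captured as a small checklist helper. This is a minimal sketch, not a UCSM API: `ACTIVATION_ORDER`, `STARTUP_ONLY`, and `activation_plan` are illustrative names, and the order encoded here is the general one from the slide, which the release notes may override.

```python
# General activation order from the slide; check the release notes before relying on it.
ACTIVATION_ORDER = ["interface card", "CIMC", "IOM", "UCSM", "FI"]

# Components the slide says to activate with "Set Startup Only".
STARTUP_ONLY = {"interface card", "IOM"}


def activation_plan(components):
    """Sort component names into activation order and flag the startup-only ones."""
    rank = {name: i for i, name in enumerate(ACTIVATION_ORDER)}
    unknown = [c for c in components if c not in rank]
    if unknown:
        raise ValueError(f"unknown components: {unknown}")
    ordered = sorted(set(components), key=rank.__getitem__)
    return [(name, name in STARTUP_ONLY) for name in ordered]
```

Calling `activation_plan(["FI", "IOM", "CIMC"])` returns the three components in activation order, with only the IOM flagged as startup-only.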

Updating Components

Update means copying the new code to the backup location of all UCSM components

It simply stages the new code

All components can be updated at once

The time to update will vary based on the component

IOMs take a long time, up to 5 minutes

If any component has issues, check the FSM for that component

The update process does not apply to the FI

Once everything is in the “Ready” state, you can move on to Activate

Activate Components

In this process you activate the code that you copied

Some code is activated but set to “activate on next reboot”

Understand that in this stage you can create outages

Activate the “leaves of the tree” first
– The BU uses this term to mean that the order should be:
• Interface card = leaf
• CIMC = twig
• IOM = branch
• UCSM = trunk
• FI = root

Activate Blade Components

The recommended method is to use policies
– Host Firmware Policy to apply the latest BIOS, Board Controller, Adapters, etc.

Activate Interface cards

Select “Set Startup Version Only”

If you uncheck this box, it will cause a blade reboot!

Activate CIMC

CIMC can be activated without disruption to the OS on the blade

The KVM session will be lost while activating

Activate IOM

Same as the interface card: “Set Startup Version Only”

The IOM needs to be at the same version as the FI!

Activate UCSM

Will cause UCSM to disconnect

Takes a few minutes

Activate Fabric Interconnect

Recommended to activate one FI at a time

A complete outage will not occur
– One fabric fails
– Wait for all network and FC traffic to fail over to the second fabric

Highly recommended to have an outage window
– Biggest risk is SAN storage

The FI will upgrade and reboot
– Part of the process is to reboot the connected IOM as well
– It can take up to 10-15 minutes for the FI and all IOMs to come back online

If there is any failure during the first FI upgrade, STOP! Do not attempt to upgrade the second FI
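The "one FI at a time, stop on failure" rule can be sketched as a short loop. This is an illustration only: `activate` and `verify` are hypothetical caller-supplied hooks (for example, a manual checklist step), not UCSM functions.

```python
def activate_fabric_interconnects(order, activate, verify):
    """Activate FIs one at a time and stop at the first failure.

    order:    FI names in the order to upgrade them.
    activate: hypothetical hook that starts activation on one FI.
    verify:   hypothetical hook that returns True once that FI is healthy again.
    Returns the list of FIs that were activated and verified.
    """
    done = []
    for fi in order:
        activate(fi)
        if not verify(fi):
            # A failed FI means the remaining FI must not be touched.
            break
        done.append(fi)
    return done
```

If the first FI fails verification, the function returns an empty list and the second FI is never activated, which keeps one fabric carrying traffic.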

Activate Fabric Interconnect

Activate the FI from the Equipment tab

Upgrade the subordinate first

Choose the correct Kernel and System versions

The FI will take a few minutes and then reboot

IOMs will get updated as well

Verify Fabric Interconnect upgrade

Make sure the IOMs and FI all show the correct running version

Upgrade Primary Fabric Interconnect

Upgrade the primary FI now using the same process

UCSM will fail over to the subordinate FI

You will need to log back in to UCSM

Problems

The biggest concern is a failed IOM upgrade
– There is no way to upgrade an IOM manually in the field
– RMA the failed IOM
– Can attempt a physical reseat of the IOM

A failed FI upgrade can be recovered
– Similar to the N5K, it will require console access and a TFTP server to boot from
– Refer to the FI recovery method

Host Firmware

Highly recommended that the blade BIOS match the running UCSM version

The best way to upgrade the BIOS is through a Host Firmware Policy
– Create the policy in UCSM
– Apply the policy to the Service Profile (SP)
– This will reboot the blade, so an outage window is needed

Create Host Firmware Policy

From the Servers tab

Host Firmware Policy

Note that a firmware policy can include:
– Adapters, BIOS, Board Controller, FC Adapters, HBA Option ROM and Storage Controller

Note how Adapters and FC Adapters can be part of a policy
– If adapters are part of a policy, they can only be changed through that firmware policy

Recommended to upgrade BIOS and Storage Controller at a minimum

Board Controller rarely changes and is specific to B230 and B440

Set BIOS versions

Best to choose all hardware

Set the BIOS to the latest version in the pull-down for each blade/server

The latest BIOS version will be different for some servers

Add the new Firmware Policy to a SP

Select the Host Firmware policy

The blade will reboot once you “Save Changes”

Additions / Modifications from version 2.1

– Firmware Auto Install
– Install Infrastructure Firmware
– Install Server Firmware

We just made it simple to upgrade

Firmware Auto-Install

Firmware Auto-Install implements package-version-based upgrades for both UCS Infrastructure components and Server components.

Firmware Auto-Install cannot be used to upgrade Management Extensions or the Capability Catalog. These are simple, occasional updates in UCSM and are therefore left under user control.

It is a two-step process: “Install Infrastructure Firmware” and “Install Server Firmware”.

It is recommended to run “Install Infrastructure Firmware” first and then “Install Server Firmware”.

All existing firmware upgrade mechanisms are retained. Users who do not want to use Auto-Install can continue to use the existing documented upgrade procedures.

Install Infrastructure Firmware (contd)

This is the sequence followed by “Install Infrastructure Firmware”:

1. Upgrade UCSM
2. Update backup image of all IOMs
3. Activate all IOMs with the set-startup option
4. Activate secondary Fabric Interconnect
5. Wait for User Acknowledgement
6. Activate primary Fabric Interconnect
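The six-step sequence, with its pause before the primary FI, can be modelled as a list walked in order with one gate. This is a sketch for illustration: `INFRA_STEPS`, `ACK_STEP`, and `run_infra_install` are hypothetical names, and `user_acknowledged` stands in for the operator's decision rather than any UCSM call.

```python
ACK_STEP = "Wait for user acknowledgement"

# The sequence from the slide, in order.
INFRA_STEPS = [
    "Upgrade UCSM",
    "Update backup image of all IOMs",
    "Activate all IOMs with the set-startup option",
    "Activate secondary Fabric Interconnect",
    ACK_STEP,
    "Activate primary Fabric Interconnect",
]


def run_infra_install(user_acknowledged):
    """Return the steps that complete; the walk pauses at the acknowledgement gate.

    user_acknowledged: callable returning True once the operator has confirmed
    that the data path is ready for a reboot of the primary FI.
    """
    completed = []
    for step in INFRA_STEPS:
        if step == ACK_STEP and not user_acknowledged():
            return completed  # primary FI is left untouched
        completed.append(step)
    return completed
```

Without acknowledgement the walk stops after the secondary FI, which mirrors the point of the gate: the primary FI is not rebooted until someone has checked the data path.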

Install Infrastructure Firmware GUI

Install Infrastructure Firmware - Cancelling

• A scheduled “Install Infra” operation can be cancelled.
• An “Install Infra” operation that is already “In Progress” cannot be cancelled.
• Both GUI and CLI options are available for cancelling.

Install Infrastructure Firmware – User Acknowledgement for primary FI

• “Install Infra” expects explicit permission from the user before it starts the firmware upgrade on the primary Fabric Interconnect.
• This is necessary to protect the data path for servers.
• As part of “Install Infra”, the secondary FI's firmware is upgraded first.
• The secondary FI reboots as part of firmware activation.
• After the secondary FI comes online, users are expected to check whether the data path is ready for a reboot of the primary FI.
• When users have ensured that the data path is ready, they can acknowledge the reboot of the primary FI.

Acknowledge Primary FI reboot

Install Server Firmware

• Install-Server offers a way to update multiple host firmware packages using package versions.
• It provides the list of Service Profiles that will be affected when a host firmware package is modified. Multiple SPs can use the same host firmware package.
• It also provides a final summary of the physical servers that will be rebooted for the set of host firmware packages being modified.
• Only the GUI is available for "Install Server Firmware". There is no CLI.

Install Server Firmware – Screens 1 to 6

Troubleshooting the Cisco Unified Computing System

Chetan Badami, Cisco TAC

Agenda

Troubleshooting UCSM & Fabric Interconnect
– Fault types
– Clustering issues
– Common issues

Blade Servers

IOM & Chassis

UCS System Components

UCS manager

UCS Fabric Interconnect (6xxx)

UCS Fabric Extenders (2xxx)

UCS 5100 Blade Chassis

UCS B-series servers

Nexus 2000 switch

UCS C-series servers

UCS Network adapters


UCS 6200 Fabric Interconnect (FI)

Standalone or clustered (Primary / Subordinate)

Data Management Engine (DME)

[Diagram: FI-A and FI-B, each with its own management IP (IP #A, IP #B) and database (DB), connected to the management network behind a virtual IP and joined by cluster links]

UCSM

UCSM GUI

CLI
UCS-A# scope server x/y

NXOS
UCS-A# connect nxos a
UCS-A(nxos)# show…

XML API

Fault Types

Type           Description
FSM            An FSM task has failed to complete successfully, or Cisco UCS Manager is retrying one of the stages of the FSM.
equipment      Cisco UCS Manager has detected that a physical component is inoperable or has another functional issue.
server         Cisco UCS Manager cannot complete a server task, such as associating a service profile with a server.
configuration  Cisco UCS Manager cannot successfully configure a component.
environment    Cisco UCS Manager has detected a power problem, thermal problem, voltage problem, or loss of CMOS settings.
management     Cisco UCS Manager has detected a serious management issue, such as a critical service that could not be started.
connectivity   Cisco UCS Manager has detected a connectivity problem, such as an unreachable adapter.
network        Cisco UCS Manager has detected a network issue, such as a link down.
operational    Cisco UCS Manager has detected an operational problem, such as a log capacity issue or a failed server discovery.

Events per Component

FarNorth-A# scope server ?
  WORD          <chassis-id>/<blade-id>
  dynamic-uuid  Dynamic UUID

FarNorth-A# scope server 1/1
FarNorth-A /chassis/server # show event

UCSM Faults - GUI

Information Fault

Major Fault

Finite State Machine (FSM)

Workflow with many stages

Data Management Engine (DME) … Application Gateway (AG) … End Point (EP)

<Object><Workflow><Operation><Where-is-it-executed>

FSM Details
– Operation (workflow)
– Stage description
– Error description for that stage

Contexts

UCS has three CLI “contexts”:

UCSM (GUI equivalent, uses the “scope” command)

NXOS (not configurable, read only)

Local-Management (file management, tech support, reboot)

Scope

Scoping: movement to different UCS configuration components

Details on hardware components are obtained with the connect command

You want to be on the primary Fabric Interconnect

UCS-B# scope ?
  adapter              Mezzanine Adapter
  chassis              Chassis
  eth-server           Ethernet Server Domain
  eth-storage          Ethernet Storage
  eth-traffic-mon      Ether Traffic Monitoring Domain
  eth-uplink           Ethernet Uplink
  fabric-interconnect  Fabric Interconnect
  fc-storage           FC Storage
  fc-traffic-mon       FC Traffic Monitoring Domain
  fc-uplink            FC Uplink
  fex                  FEX (fabric-extender) Module
  firmware             Firmware
  host-eth-if          Host Ethernet Interface
  host-fc-if           Host FC Interface
  license              License
  monitoring           Monitor the system
  org                  Organizations
  power-cap-mgmt       Power Cap Mgmt
  security             security mode
  server               Server
  service-profile      Service Profile
  system               Systems
  vhba                 vHBA
  vnic                 vNIC

Connect - Hardware Troubleshooting

Connect attaches you to hardware and to the read-only NXOS

FarNorth-B# connect
  adapter     Mezzanine Adapter
  bmc         Baseboard Management Controller (CIMC)
  clp         Connect to DMTF CLP
  iom         IO Module
  local-mgmt  Connect to Local Management CLI
  nxos        Connect to NXOS CLI

FarNorth-A# connect local-mgmt
  <CR>        (defaults to primary)
  a           Fabric A
  b           Fabric B

FarNorth-A(local-mgmt)# ?
  cd                Change current directory
  clear             Reset functions
  cluster           Cluster mode
  connect           Connect to Another CLI
  copy              Copy a file
  cp                Copy a file
  delete            Delete managed objects
  dir               Show content of dir
  enable            Enable
  end               Go to exec mode
  erase             Erase
  erase-log-config  Erase the mgmt logging config file
  exit              Exit from command interpreter
  install-license   Install a license
  ls                Show content of dir
  mkdir             Create a directory
  move              Move a file
  mv                Move a file
  ping              Test network reachability
  pwd               Print current directory
  reboot            Reboots Fabric Interconnect
  rm                Remove a file
  rmdir             Remove a directory
  run-script        Run a script
  show              Show running system information
  ssh               SSH to another system
  tail-mgmt-log     Tail mgmt log file
  telnet            Telnet to another system
  terminal          Set terminal line parameters
  top               Go to the top mode
  traceroute        Traceroute to destination

Most dangerous:
– erase configuration
– reboot

Connect NXOS

Used to assist in troubleshooting; very familiar to IOS and Nexus users, with all the show commands

Used to run debugs advised by TAC

Commands:
– Show the switch running config (non-server config)
– Clear interface counters found on the FI

Cannot be used to configure UCS (read only)

Connect to NXOS

FarNorth-A# connect nxos ?
  <CR>
  a  Fabric A
  b  Fabric B

Popular examples:
  show run
  show fex detail
  show interface
  show lacp
  show trunk
  show cdp
  debug
  show npv flogi-table
  show mac-address-table

FarNorth-A(nxos)# ?
  clear         Reset functions [Only place to clear counters]
  cli           CLI commands
  debug         Debugging functions
  debug-filter  Enable filtering for debugging functions
  ethanalyzer   Configure cisco packet analyzer
  interface     A live capture will start on following interface
  no            Negate a command or set its defaults
  ntp           NTP configuration
  show          Show running system information
  system        System management commands
  terminal      Set terminal line parameters
  test          Test command
  undebug       Disable Debugging functions (See also debug)
  end           Go to exec mode
  exit          Exit from command interpreter
  pop           Pop mode from stack or restore from name
  push          Push current mode to stack or save it under name
  where         Shows the cli context you are in

UCSM – Common issues

DME clustering problems

Is the other FI up and operational?

Are the clustering links up?

Is there at least 1 chassis successfully discovered on both FIs?

UCS-A# show cluster extended-state
UCS-A# show pmon state
UCS-A(local-mgmt)# cluster lead a
UCS-A(local-mgmt)# cluster force primary
UCS-A /monitoring/sysdebug # show cores

Sample – Cluster state

Sample – Process state (pmon)
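When cluster-state output is collected from many domains, a short parser can answer the first two questions above. The sketch below is only an illustration: the three-line `SAMPLE` text is an assumed, simplified layout for the cluster-state output, not a capture from a real system, so the patterns should be checked against actual output before use.

```python
import re

# Assumed, simplified layout of the cluster-state output (not a real capture).
SAMPLE = """\
A: UP, PRIMARY
B: UP, SUBORDINATE
HA READY
"""


def parse_cluster_state(text):
    """Return ({fi: {"state", "role"}}, ha_ready) parsed from cluster-state text."""
    fis = {}
    ha_ready = None  # stays None if no HA line is present
    for raw in text.splitlines():
        line = raw.strip()
        m = re.fullmatch(r"([AB]): (\w+), (\w+)", line)
        if m:
            fis[m.group(1)] = {"state": m.group(2), "role": m.group(3)}
        elif line.startswith("HA "):
            ha_ready = (line == "HA READY")
    return fis, ha_ready
```

A result where one FI is not UP, or where the HA flag is False, points back to the checklist: the peer FI, the cluster links, and chassis discovery on both FIs.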

Agenda

Troubleshooting UCSM & Fabric Interconnect

Blade Servers
– CIMC/BIOS
– OBFL/SEL

IOM & Chassis

Blade servers

Blade overview – Hardware & Software Components

CPU & heatsink

Memory DIMMs

Mezzanine adapter

CIMC

HDD

Blade servers

Blade overview – CIMC and BIOS

CIMC
– Monitors temperature and power readings
– KVM & vMedia
– Blade control

BIOS
– Can be configured via F2 or via BIOS Policy

OBFL

Onboard Failure Logging (OBFL) stores hardware logs for the different components, saved at the time of the issue.

Alternatively, the logs can be viewed by connecting to the internal component end device.

Show tech-support will capture the logs required by support.

System Event Log (SEL) - Events Supported

Server BIOS events, for 3 kinds of equipment end-points:
– Memory Unit (DIMM): ECC errors, Address Parity, Memory Mismatch
– Processor Unit: Memory Mirroring, Sparing, SMI Link errors
– Motherboard: PCIe, QPI uncorrectable errors, Legacy PCI errors

All these errors are modeled as stats properties. The ones for which thresholds are not defined get reported as statistics only.

BMC, BIOS, and OS log platform errors to the CIMC’s System Event Log (SEL) buffer
– POST and run-time errors
– Used as an effective health monitoring tool

System Event Logs

Make sure that servers are discovered

Make sure the backup destination path is valid

Can be done via CLI also

System Event Logs = Management Logs on earlier releases

[Screenshots: Chassis, Server]

CIMC Booting Problems - Blades

Corrupt CIMC firmware

POST failure

Not completing boot

Connect to CIMC in-band manager to diagnose

View logs, collect tech-support, monitor KVM output

Manually reboot CIMC

Fault codes: http://www.cisco.com/en/US/partner/docs/unified_computing/ucs/ts/faults/reference/ErrMess.html

Connecting to CIMC Debug Utility

Used to verify the health of a blade when UCSM is in question and you want to look at the lowest level of blade data points

Used to determine blade component issues at the source

UCS-A# connect cimc 1/1
Trying 127.5.1.1...
Connected to 127.5.1.1.
Escape character is '^]'.

CIMC Debug Firmware Utility Shell

(Chassis 1, Server 1, motherboard CIMC)

Debug Firmware Utility
  alarms
  cores
  exit
  help [COMMAND]
  images
  mctools
  memory
  messages
  network
  obfl
  post
  power
  sensors
  sel
  fru
  mezz1fru
  mezz2fru
  tasks
  top
  update
  users
  version

Blade servers – Common issues: CIMC issues

Server discovery failed
– Check minimum software version
– Reseat blade
– Minimum hardware satisfied?

No KVM video
– Does the CIMC have an IP?

Is the BIOS corrupt?
– Recover BIOS
– Reset CMOS

UCS-A# show version
UCS-A /system # show capability
UCS-A /chassis/server/cimc # show mgmt-if
UCS-A /chassis/server # show post
UCS-A /chassis/server # reset-kvm
UCS-A /chassis/server # recover-bios <file>
UCS-A /chassis/server # reset-cmos

Blade servers – Common issues: Hardware issues

Blade won’t boot
– Did POST complete?

Types of DIMM errors
– Mapped out
– Disabled
– Inoperable
– Degraded

UCS-A# connect cimc x/y
[ help ] # post
[ post ] # obfl
[ obfl ] # sel

UCS-A /chassis/server # show memory [detail]
UCS-A /chassis/server/memory-array/dimm # show stats memory-error-stats detail

Blade servers – Common issues: Unexpected reboot

Service profile modifications
– Firmware updates
– Configuration changes
– Use Maintenance policies to defer changes

OS initiated
– Check OS

Hardware issue

IOM / FI issues

UCS-A /chassis/server # show fsm status

UCS-A# connect cimc x/y
[ help ] # post
[ post ] # obfl
[ obfl ] # sel

Blade servers – Top 5 commands

UCS-A /chassis/server # show inventory expand detail

UCS-A /chassis/server # show status detail

UCS-A /chassis/server # show post

UCS-A /chassis/server # show sel

UCS-A /chassis/server # show fsm status
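For repeated triage it can help to generate the command list per blade. The sketch below only builds the text to type, using the five commands from the slide plus the `scope server` step shown earlier; `TOP_FIVE` and `blade_triage_commands` are illustrative names, and the check that IDs start at 1 is an assumption.

```python
# The five show commands from the slide, run from the /chassis/server scope.
TOP_FIVE = [
    "show inventory expand detail",
    "show status detail",
    "show post",
    "show sel",
    "show fsm status",
]


def blade_triage_commands(chassis, blade):
    """Return the CLI lines to enter for one blade, starting with the scope command."""
    if chassis < 1 or blade < 1:
        # Assumption: chassis and blade IDs are numbered from 1.
        raise ValueError("chassis and blade IDs start at 1")
    return [f"scope server {chassis}/{blade}"] + TOP_FIVE
```

The output is plain text, so it can be pasted into a session or attached to a case note alongside the collected results.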

Agenda

Troubleshooting UCSM & Fabric Interconnect

Blade Servers

IOM & Chassis
– Discovery issues
– Fan/Thermal/PSU
– Tech-support

IOM & Chassis

Overview

CMC responsibilities
– Chassis discovery
– Local cluster management
– Power & thermal management

[Diagram: IOM block diagram showing the Chassis Management Controller with FLASH, EEPROM, and DRAM, control IO, chassis signals, and a switch ASIC with 1-4 fabric links to the Fabric Interconnect and links to the blades]

IOM & Chassis – Common issues: Chassis not discovering

Check the chassis discovery policy

Server ports defined correctly

FI to IOM 1:1 relationship only

UCS-A(nxos)# show run interface ethernet x/y
UCS-A(nxos)# show interface fex-fabric
UCS-A(nxos)# show fex <chassis#> detail

IOM & Chassis – Common issues: Fan issues

Fans spinning at 100%
– Temperature
– Any fans missing?
– CMC access to thermal sensors
– Component discovery

UCS-A# connect iom 1
fex-1# show platform software cmcctrl thermal status
fex-1# show platform software cmcctrl fancontrol all
fex-1# show platform software cmcctrl ohms all

Logs for troubleshooting

General UCS issues
UCS-A(local-mgmt)# show tech-support ucsm detail
UCS-A(local-mgmt)# show tech-support chassis # all detail

Networking issues
Upstream_Switch# show tech-support details

SAN issues
UCS-A(nxos)# show tech-support npv
MDS# show tech-support details
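The issue-to-log mapping above can be kept as a small lookup so the right tech-support set is requested the first time. This is a sketch only: `TECH_SUPPORT` and `tech_support_commands` are hypothetical names, the prompts are copied from the slide, and the `{chassis}` placeholder stands for the chassis number shown as `#` on the slide.

```python
# Issue type -> (prompt, command) pairs taken from the slide.
TECH_SUPPORT = {
    "general": [
        ("UCS-A(local-mgmt)#", "show tech-support ucsm detail"),
        ("UCS-A(local-mgmt)#", "show tech-support chassis {chassis} all detail"),
    ],
    "network": [
        ("Upstream_Switch#", "show tech-support details"),
    ],
    "san": [
        ("UCS-A(nxos)#", "show tech-support npv"),
        ("MDS#", "show tech-support details"),
    ],
}


def tech_support_commands(issue, chassis=1):
    """Return 'prompt command' lines for the given issue type."""
    if issue not in TECH_SUPPORT:
        raise KeyError(f"unknown issue type: {issue}")
    return [f"{prompt} {cmd.format(chassis=chassis)}" for prompt, cmd in TECH_SUPPORT[issue]]
```

For a SAN problem this yields both the FI-side and the MDS-side commands, a reminder that the upstream device logs are part of the collection.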

UCSM and Chassis show tech from GUI

Log into the UCSM GUI

Select the Admin tab -> Faults, Audit and Event Logs section -> Tech Support File

Where to find more information

Hardware Installation & Service Guides Information: http://www.cisco.com/en/US/docs/unified_computing/ucs/overview/guide/UCS_roadmap.html#wp38892

Release Notes: http://www.cisco.com/en/US/products/ps10281/prod_release_notes_list.html

Software Upgrade & Installation Information: http://www.cisco.com/en/US/products/ps10281/prod_installation_guides_list.html

UCS Troubleshooting Guide: http://www.cisco.com/en/US/docs/unified_computing/ucs/ts/guide/UCSTroubleshooting.html

UCS Faults Reference: http://www.cisco.com/en/US/docs/unified_computing/ucs/ts/faults/reference/ErrMess.html

Cisco Support Community: https://supportforums.cisco.com/community/netpro/data-center/unified-computing

Troubleshooting UCS C-series


Upcoming Sessions

March: “Month of Routing Protocol Technology”
• Session 9 – 11th Mar 2014
• Session 10 – 25th Mar 2014

April: “Month of Wireless Technology”

And many more months and technologies

Thank you.