troubleshooting cisco switches - bbk-design.rubbk-design.ru/uploads/docs/brkrst-3143.pdf ·...
TRANSCRIPT
© 2006, Cisco Systems, Inc. All rights reserved.14664_05_2008_c2.scr
© 2008 Cisco Systems, Inc. All rights reserved. Cisco PublicBRKRST-314314664_05_2008_c2 2
Troubleshooting Cisco Catalyst 6500 Series Switches
BRKRST-3143
Agenda
Sup720 Architecture (A Quick Look)
Layer 2 and Layer 3 Unicast Troubleshooting
Multicast Troubleshooting
Virtual Switch System Troubleshooting
Sup720 Architecture
[Diagram: Supervisor Engine 720 block diagram. Labels: PFC with L2 Engine (L2 CAM) and L3 Engine (NetFlow Table, FIB TCAM, Adj CAM, QoS TCAM, ACL TCAM); CPU Card with SP CPU, RP CPU and their controllers; Port ASIC (Port 1, Port 2); Fabric ASIC and Replication Engine with Multicast Expansion Table (MET); Switch Fabric (18 - 20 Gbps Conns); EARL-DBUS, EARL-RBUS, LC-DBUS, LC-RBUS; EOBC.]
Troubleshooting Unicast Forwarding
Typical Problems?
(Some) packets don’t get through (drops, incorrect forwarding): what platform-specific counters and tables to check?
Unwanted flooding: do we learn MAC addresses, and are the L2 tables in sync?
High CPU due to SW path forwarding: how do we find out what packets hit the CPU?
Troubleshooting Unicast Forwarding
Unicast L2 and L3 Traffic: What to Check?
Test topology network diagram
Quick sanity checklist (Layer 2/Layer 3)
Detailed L2 packet flow troubleshooting: which counters and tables to look at
Detailed L3 packet flow troubleshooting: which counters and tables to look at
Some useful troubleshooting tools
Test Topology Network Diagram
DUT is the Device Under Test we are troubleshooting
DUT is a 6509 with Supervisor 720
R1/R2 are neighboring devices
Connections are, respectively, a 5 x 1 Gigabit Ethernet port channel and a 2 x Ten Gigabit Ethernet port channel
After normal network troubleshooting, the conclusion is that DUT has a problem: (some) unicast packets don’t go through. Where do we go from there?
[Diagram: R1 - DUT - R2]
Quick Sanity Check
Quickly Understand Situation/Topology/Traffic Flow
If there is no up-to-date topology diagram, confirm the connections between DUT and its (relevant) neighbors; “show cdp neighbor” can be a good tool
Check for the obvious: are all modules online and OK, are links up?
What does “show proc cpu” say?
Any log messages?
Any recent changes in configuration or topology?
Can we ping the neighboring hops (L3)?
Do we learn (neighbor) MAC addresses (L2) and routes (L3)?
If nothing is obvious, identify the traffic flows that are impacted and *should* go through the DUT
Verify the path for the impacted flow through DUT
L2 Unicast Traffic Network Configuration
[Diagram: Host1 (7.0.1.1) - R1 - DUT - R2 - Host2 (7.0.1.2), all in Vlan700. R1 Po1 (Gig5/2, Gig8/1, Gig8/2, Gig8/3, Gig8/4) connects to DUT Po2 (Gig7/2, Gig7/3, Gig7/4, Gig7/5, Gig7/6). DUT Po1 (Ten8/1, Ten8/3) connects to R2 Po2 (Ten8/1, Ten8/3).]
Find MAC address of Host 1 (using a router as host; depending on host OS, you can use e.g. arp -a)
host1#sh ip arp 7.0.1.1
Protocol Address Age (min) Hardware Addr Type Interface
Internet 7.0.1.1 - 000b.fca2.fe0a ARPA Vlan700
host1#
Find MAC address of Host 2
host2#sh ip arp 7.0.1.2
Protocol Address Age (min) Hardware Addr Type Interface
Internet 7.0.1.2 - 0011.bced.e400 ARPA GigabitEthernet2/3
host2#
Sanity Check for L2 Unicast Traffic Network Path Verification: mac Address Table Check
DUT#show mac-address-table address 000b.fca2.fe0a vlan 700 all
Legend: * - primary entry
age - seconds since last seen
n/a - not available
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
Module 1:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Active Supervisor:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Standby Supervisor:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Module 7[FE 1]:
* 700 000b.fca2.fe0a dynamic Yes 50 Po2
Module 7[FE 2]:
* 700 000b.fca2.fe0a dynamic Yes 170 Po2
Module 8[FE 1]:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Module 8[FE 2]:
700 000b.fca2.fe0a dynamic Yes 170 Po2
DUT#sh interface po2 | i Members
Members in this channel: Gi7/2 Gi7/3 Gi7/4 Gi7/5 Gi7/6
Check that the MAC addresses are present in all Forwarding Engines in the system (PFC/DFC); if not, traffic is possibly being flooded
Primary entry: the MAC is learned on an interface tied to that L2 Forwarding Engine (module 7 is the ingress line card for packets coming from this MAC); if the ingress line card is a CFC (it has no local FE), the ingress FE is the PFC of the active supervisor
Which physical link in the port channel really receives the flow?
Repeat this for the MAC address of Host2
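The "present in all Forwarding Engines" check can be scripted once the command output has been captured as text. The following is a minimal sketch, not a Cisco tool: the function names `parse_mac_table` and `check_sync` are made up for illustration, and the parser assumes the output layout shown in the capture above (a "<engine>:" header line followed by at most one entry line).

```python
import re

# Entry lines as captured above; a leading "*" marks a primary entry.
SAMPLE = """\
Module 1:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Active Supervisor:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Standby Supervisor:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Module 7[FE 1]:
* 700 000b.fca2.fe0a dynamic Yes 50 Po2
Module 7[FE 2]:
* 700 000b.fca2.fe0a dynamic Yes 170 Po2
Module 8[FE 1]:
700 000b.fca2.fe0a dynamic Yes 170 Po2
Module 8[FE 2]:
700 000b.fca2.fe0a dynamic Yes 170 Po2
"""

# primary flag, vlan, mac, type, learn, age, ports
ENTRY = re.compile(
    r"^(\*?)\s*(\d+)\s+([0-9a-f.]{14})\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$"
)


def parse_mac_table(text):
    """Map each forwarding engine header to its entry, or None if it has none."""
    engines = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":"):          # e.g. "Module 7[FE 1]:"
            current = line[:-1]
            engines[current] = None
            continue
        match = ENTRY.match(line)
        if match and current is not None:
            engines[current] = {
                "primary": match.group(1) == "*",
                "port": match.group(7),
            }
    return engines


def check_sync(engines):
    """Return (engines missing the MAC, set of ports seen, engines with primary entry)."""
    missing = [fe for fe, entry in engines.items() if entry is None]
    ports = {entry["port"] for entry in engines.values() if entry is not None}
    primary = [fe for fe, entry in engines.items() if entry and entry["primary"]]
    return missing, ports, primary


missing, ports, primary = check_sync(parse_mac_table(SAMPLE))
print("missing:", missing, "ports:", ports, "primary:", primary)
```

An empty `missing` list and a single port in `ports` correspond to tables that are in sync; the `primary` list points at the ingress forwarding engines.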
Sanity Check for L2 Unicast Traffic Network Path Verification: mac Address Table Check
DUT#show mac-address-table address 0011.bced.e400 vlan 700 all
…
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
Active Supervisor:
700 0011.bced.e400 dynamic Yes 265 Po1
Standby Supervisor:
700 0011.bced.e400 dynamic Yes 260 Po1
Module 7[FE 1]:
700 0011.bced.e400 dynamic Yes 265 Po1
Module 7[FE 2]:
700 0011.bced.e400 dynamic Yes 265 Po1
Module 8[FE 1]:
* 700 0011.bced.e400 dynamic Yes 230 Po1
Module 8[FE 2]:
* 700 0011.bced.e400 dynamic Yes 260 Po1
DUT#sh interface po1 | i Members
Members in this channel: Te8/1 Te8/3
Sanity Check for L2 Unicast Traffic Network Path Verification: Which EtherChannel Links?
R1#show etherchannel load-balance
EtherChannel Load-Balancing Configuration:
dst-ip
mpls label-ip
EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Destination MAC address
IPv4: Destination IP address
IPv6: Destination IP address
MPLS: Label or IP
R1#remote command switch test etherchannel load-balance interface po1 ip 7.0.1.2
Computed RBH: 0x1
Would select Gi8/1 of Po1
Check the load-balancing configuration in use; the default is src-dst-ip
Here the mode is “dst-ip”, so the test command takes only the destination IP as argument. As of 12.2(33)SXH, a new CLI was added to the RP: show etherchannel load-balance hash-result … (same arguments); one can also use remote login switch (instead of remote command)
The link selected is Gi8/1 in Po1 of R1 for traffic to 7.0.1.2 leaving R1
Repeat the same steps to find the links used in Po2 and Po1 on DUT and in Po2 on R2, in both directions (to 7.0.1.2 and to 7.0.1.1)
Sanity Check for L2 Unicast Traffic Network Path Verification: Which EtherChannel Links?
DUT#show etherchannel load-balance
EtherChannel Load-Balancing Configuration:
dst-ip
mpls label-ip
EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Destination MAC address
IPv4: Destination IP address
IPv6: Destination IP address
MPLS: Label or IP
DUT#remote command switch test etherchannel load-balance int po1 ip 7.0.1.2
Computed RBH: 0x1
Would select Te8/3 of Po1
DUT#remote command switch test etherchannel load-balance int po2 ip 7.0.1.1
Computed RBH: 0x2
Would select Gi7/4 of Po2
R2#show etherchannel load-balance
…
R2#remote command switch test etherchannel load-balance int po2 ip 7.0.1.1
Computed RBH: 0x2
Would select Te8/1 of Po2
Sanity Check for L2 Unicast Traffic Network Path Verification: Result
Each direction can use different links in the bundles!
[Diagrams: Host1 (7.0.1.1) - R1 - DUT - R2 - Host2 (7.0.1.2) in Vlan700, one diagram per direction. From the commands above: traffic to 7.0.1.2 leaves R1 on Gi8/1 of Po1 and leaves DUT on Te8/3 of Po1; traffic to 7.0.1.1 leaves R2 on Te8/1 of Po2 and leaves DUT on Gi7/4 of Po2.]
L3 Unicast Traffic Network Configuration
DUT is the Device Under Test we are troubleshooting
DUT is a 6509 with Supervisor 720
R1/R2 are neighboring devices
Connections are, respectively, a 5 x 1 Gigabit L2 Ethernet port channel carrying VLANs 701 to 705 and 2 x L3 Ten Gigabit links
Running equal cost multi path routing with, respectively, 5 and 2 equal cost paths
DUT has a problem: (some) unicast packets don’t go through
[Diagram: Host1 (8.0.1.1) - R1 - DUT - R2 - Host2 (9.0.1.2). R1 Po1 connects to DUT Po2 over the Gigabit links; DUT connects to R2 over Ten8/1 and Ten8/3.]
Sanity Check for L3 Unicast Traffic
R1#sh ip route 9.0.1.0 | i via
Known via "eigrp 700", distance 90, metric 3328, type internal
Redistributing via eigrp 700
* 7.2.1.2, from 7.2.1.2, 00:21:58 ago, via Vlan702
7.5.1.2, from 7.5.1.2, 00:21:58 ago, via Vlan705
7.4.1.2, from 7.4.1.2, 00:21:58 ago, via Vlan704
7.3.1.2, from 7.3.1.2, 00:21:58 ago, via Vlan703
7.1.1.2, from 7.1.1.2, 00:21:58 ago, via Vlan701
R1#sh ip cef exact-route 8.0.1.1 9.0.1.2
8.0.1.1 -> 9.0.1.2 : Vlan701 (next hop 7.1.1.2)
R1#show mls cef exact-route 8.0.1.1 0 9.0.1.2 0
Interface: Vl705, Next Hop: 7.5.1.2, Vlan: 705, Destination Mac: 0050.f0f8.7400
R1#remote command switch test etherchannel load-balance int po1 ip 9.0.1.2
Computed RBH: 0x7
Would select Gi8/2 of Po1
Network Path Verification: Which L3 Next Hop/L2 Link?
Which next hop will actually be used by the traffic flow in case of Equal Cost Multi Path routing?
sh ip cef exact-route: check the next hop used for SW based CEF (SW forwarding data path) for flow 8.0.1.1 -> 9.0.1.2
show mls cef exact-route: check the next hop used for HW based CEF (HW forwarding data path) for flow 8.0.1.1 -> 9.0.1.2; source and destination port are 0 because the test flow in the example was ICMP echo request/reply
test etherchannel load-balance: check which link of the 5-port EtherChannel between R1 and DUT is used, based on the EtherChannel load-balance configuration
Traffic flow 8.0.1.1 -> 9.0.1.2 leaves R1 on the Gi8/2 link, in vlan 705, to next hop 7.5.1.2 for HW CEF switched packets; SW CEF switched packets use the same link, but in vlan 701, to next hop 7.1.1.2
Repeat the same steps to find the L3 next hops and links on DUT and R2, in both directions
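Because the SW and HW CEF paths can pick different equal-cost next hops for the same flow, it can help to compare the two exact-route answers mechanically. This is a minimal sketch under stated assumptions: the helper names (`sw_next_hop`, `hw_route`, `compare`) are hypothetical, and the regular expressions only cover the output formats that appear in this transcript.

```python
import re

IP = r"\d+\.\d+\.\d+\.\d+"


def sw_next_hop(line):
    """Next hop from 'sh ip cef exact-route' output (both formats seen in the captures)."""
    match = re.search(rf"\(next hop ({IP})\)|addr ({IP})", line)
    return (match.group(1) or match.group(2)) if match else None


def hw_route(line):
    """Fields from 'show mls cef exact-route' output, or None if the line does not match."""
    match = re.search(
        rf"Interface: ([^,\s]+), Next Hop: ({IP}), Vlan: (\d+), Destination Mac: (\S+)",
        line,
    )
    if not match:
        return None
    return {
        "interface": match.group(1),
        "next_hop": match.group(2),
        "vlan": int(match.group(3)),
        "dmac": match.group(4),
    }


def compare(sw_line, hw_line):
    """Report whether the SW and HW paths agree on the next hop."""
    sw = sw_next_hop(sw_line)
    hw = hw_route(hw_line)
    hw_nh = hw["next_hop"] if hw else None
    return {"sw_next_hop": sw, "hw_next_hop": hw_nh, "same": sw is not None and sw == hw_nh}


# Lines copied from the R1 capture above.
SW_LINE = "8.0.1.1 -> 9.0.1.2 : Vlan701 (next hop 7.1.1.2)"
HW_LINE = "Interface: Vl705, Next Hop: 7.5.1.2, Vlan: 705, Destination Mac: 0050.f0f8.7400"

print(compare(SW_LINE, HW_LINE))
```

For the R1 capture, the comparison reports different next hops (7.1.1.2 in SW, 7.5.1.2 in HW), which is the situation described in the text: the HW result is the one that matters for hardware-switched traffic.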
Sanity Check for L3 Unicast Traffic
DUT#sh ip route 9.0.1.0 | i via
Known via "eigrp 700", distance 90, metric 3072, type internal
Redistributing via eigrp 700
* 7.7.1.2, from 7.7.1.2, 00:07:33 ago, via TenGigabitEthernet8/3
7.6.1.2, from 7.6.1.2, 00:07:33 ago, via TenGigabitEthernet8/1
DUT#sh ip cef exact-route 8.0.1.1 9.0.1.2
8.0.1.1 -> 9.0.1.2 => IP adj out of TenGigabitEthernet8/1, addr 7.6.1.2
DUT#show mls cef exact-route 8.0.1.1 0 9.0.1.2 0 mod 7
Interface: Te8/3, Next Hop: 7.7.1.2, Vlan: 1090, Destination Mac: 000f.f8e4.d000
DUT#sh vlan internal usage | i 1090
1090 TenGigabitEthernet8/3
Network Path Verification: Which L3 Next Hop?
Look at the ingress line card L3 tables: all of the L3 tables should be in sync, but the lookup happens at the ingress DFC/PFC. If the ingress module doesn’t have a DFC, the ingress forwarding engine is the PFC of the active supervisor
The next hop for an L3 interface is linked to an internal VLAN; check that the internal VLAN matches the physical interface
Traffic flow 8.0.1.1 -> 9.0.1.2 leaves DUT on the Ten8/3 link
Repeat the same steps for DUT and R2 (both directions)
Sanity Check for L3 Unicast Traffic
DUT#sh ip route 8.0.1.0 | i via
Known via "eigrp 700", distance 90, metric 3072, type internal
Redistributing via eigrp 700
* 7.5.1.1, from 7.5.1.1, 00:15:49 ago, via Vlan705
7.4.1.1, from 7.4.1.1, 00:15:49 ago, via Vlan704
7.3.1.1, from 7.3.1.1, 00:15:49 ago, via Vlan703
7.2.1.1, from 7.2.1.1, 00:15:49 ago, via Vlan702
7.1.1.1, from 7.1.1.1, 00:15:49 ago, via Vlan701
DUT#sh ip cef exact-route 9.0.1.2 8.0.1.1
9.0.1.2 -> 8.0.1.1 => IP adj out of Vlan701, addr 7.1.1.1
DUT#show mls cef exact-route 9.0.1.2 0 8.0.1.1 0 mod 8
Interface: Vl705, Next Hop: 7.5.1.1, Vlan: 705, Destination Mac: 0011.bc75.9c00
DUT#remote command switch test etherchannel load-balance int po2 ip 8.0.1.1
Computed RBH: 0x4
Would select Gi7/6 of Po2
Traffic flow 9.0.1.2 -> 8.0.1.1 leaves DUT on Gi7/6 link, in vlan 705
Network Path Verification: Which L3 Next Hop/L2 Link?
Sanity Check for L3 Unicast Traffic
R2#sh ip route 8.0.1.0 | i via
Known via "eigrp 700", distance 90, metric 3328, type internal
Redistributing via eigrp 700
* 7.7.1.1, from 7.7.1.1, 00:32:01 ago, via TenGigabitEthernet8/3
7.6.1.1, from 7.6.1.1, 00:32:01 ago, via TenGigabitEthernet8/1
R2#sh mls cef exact-route 9.0.1.2 0 8.0.1.1 0
Interface: Te8/3, Next Hop: 7.7.1.1, Vlan: 4043, Destination Mac: 0050.f0f8.7400
Traffic flow 9.0.1.2 -> 8.0.1.1 leaves R2 on Ten8/3 link
Network Path Verification: Which L3 Next Hop?
Sanity Check for L3 Unicast Traffic Network Path Verification: Result
Each direction can use different links!
[Diagrams: Host1 (8.0.1.1) - R1 - DUT - R2 - Host2 (9.0.1.2), one diagram per direction. From the commands above, for HW switched packets: flow 8.0.1.1 -> 9.0.1.2 leaves R1 on Gi8/2 (vlan 705) and leaves DUT on Ten8/3; flow 9.0.1.2 -> 8.0.1.1 leaves R2 on Ten8/3 and leaves DUT on Gi7/6 (vlan 705).]
What Did We Get from Path Verification?
The physical links on which the specific traffic flow should enter and leave the DUT, as well as the exact L3 next hops
Caveat: flapping links in a port channel can change the bundle hash mapping, and with it the physical path of the traffic
Clearing routes can also change the order in which the L3 adjacencies get re-programmed and, in case of ECMP, hence change the physical path of the traffic
If any of these happen, you need to re-verify the path
Detailed L2 Packet Flow Troubleshooting
“Verify the Traffic Path in the Switch”
Identify path in the switch
Check counters
Verify L2 forwarding tables (HW/SW)
[Diagram: Module 7 (WS-X6748 with DFC3: 4 x 12xGE port ASICs, two Fabric Interface & Replication Engines each with a MET, Layer 2 Engines, L3/4 Engine) and Module 8 (WS-X6704 with DFC3: 4 x 1x10GE port ASICs, two Fabric Interface & Replication Engines each with a MET, Layer 2 Engines, L3/4 Engine), connected through the Switch Fabric and the EOBC. Ports in the path: Gig7/4, Ten8/1, Ten8/3.]
Detailed L2 Packet Flow Troubleshooting
Identify the “Traffic Path in the Switch”: Which Fabric Channels?
DUT#sh fabric fpoe interface Gi 7/4
fpoe for GigabitEthernet7/4 is 15
DUT#sh fabric fpoe interface ten 8/1
fpoe for TenGigabitEthernet8/1 is 16
DUT#sh fabric fpoe interface ten 8/3
fpoe for TenGigabitEthernet8/3 is 7
DUT#sh fabric fpoe map
slot channel fpoe
… … …
7 0 6
7 1 15
8 0 7
8 1 16
… … …
For each ingress/egress interface identified in the path verification, find the Fabric Port Of Exit (FPOE) the interface maps to
Find what fabric channel the relevant FPOEs map to and, combined with the previous command, what fabric channel maps to what interface
Gig7/4 maps to slot 7, fabric channel 1; Ten8/1 maps to slot 8, channel 1; Ten8/3 maps to slot 8, channel 0
[Diagram: Mod 7 and Mod 8 connected through the fabric, with Gig7/4 on slot 7, channel 1; Ten8/1 on slot 8, channel 1; Ten8/3 on slot 8, channel 0.]
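The two-step lookup (interface to FPOE, then FPOE to slot and fabric channel) is a simple join of the two command outputs. The sketch below illustrates it on the values captured above; the function names are hypothetical and the parsing assumes the plain three-column layout of "sh fabric fpoe map" shown in the transcript.

```python
import re

# Rows copied from the "sh fabric fpoe map" capture (elided rows left out).
FPOE_MAP = """\
slot channel fpoe
7 0 6
7 1 15
8 0 7
8 1 16
"""

# Lines copied from the per-interface "sh fabric fpoe interface ..." captures.
FPOE_PER_INTERFACE = """\
fpoe for GigabitEthernet7/4 is 15
fpoe for TenGigabitEthernet8/1 is 16
fpoe for TenGigabitEthernet8/3 is 7
"""


def parse_fpoe_map(text):
    """Return {fpoe: (slot, channel)} from the three-column map output."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and all(f.isdigit() for f in fields):
            slot, channel, fpoe = (int(f) for f in fields)
            mapping[fpoe] = (slot, channel)
    return mapping


def parse_interface_fpoe(text):
    """Return {interface: fpoe} from 'fpoe for <interface> is <n>' lines."""
    return {m.group(1): int(m.group(2))
            for m in re.finditer(r"fpoe for (\S+) is (\d+)", text)}


def interface_channels(map_text, interface_text):
    """Join both outputs: {interface: (slot, channel)}; None if the FPOE is unknown."""
    fpoe_map = parse_fpoe_map(map_text)
    return {ifname: fpoe_map.get(fpoe)
            for ifname, fpoe in parse_interface_fpoe(interface_text).items()}


for ifname, slot_channel in interface_channels(FPOE_MAP, FPOE_PER_INTERFACE).items():
    print(ifname, "->", slot_channel)
```

The resulting (slot, channel) pairs are the ones to look at in the fabric status, utilization and error counters.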
Detailed L2 Packet Flow Troubleshooting
Counters and L2 Tables Overview
[Diagram: Module 7 (WS-X6748, DFC3) and Module 8 (WS-X6704, DFC3) connected through the Switch Fabric and the EOBC, annotated with what to check along the path: port counters at Gig7/4, Ten8/1 and Ten8/3; L2 Engine counters and tables on both modules; fabric counters on module 7 channel 1 and on module 8 channels 0 and 1.]
Detailed L2 Packet Flow Troubleshooting
Verify L2 Counters: Interface Counters (Port asic)
DUT#clear counters
DUT#clear vlan 700 counters
DUT#sh int gi 7/4 count
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Gi7/4 249784 2000 8 40
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Gi7/4 245614 2000 6 0
DUT#sh int ten 8/3 count
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te8/3 10590 18 28 0
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te8/3 246449 2000 10 0
DUT#sh int ten 8/1 count
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te8/1 273441 2032 174 0
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te8/1 2890 0 11 0
DUT#
clear counters: cleared the interface counters (port level), just for illustration
clear vlan 700 counters: cleared the L2 Forwarding Engine VLAN counters, just for illustration
Did a ping (2000 packets, 100 bytes per packet) from 7.0.1.1 -> 7.0.1.2; verify that the interface counters relevant to the path did move sufficiently!
[Diagram: path overview through Mod 7 (4 x 12xGE port ASIC), the fabric and Mod 8 (4 x 10GE port ASIC), with the port ASICs of Gig7/4, Ten8/1 and Ten8/3 marked as the points being checked.]
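"Did the counters move sufficiently" can be checked by comparing the unicast packet counters on each hop of the verified path against the number of test packets sent. A minimal sketch follows, assuming the two-line "header row, value row" layout of the captures above; `parse_counters` and `path_ok` are illustrative names, not part of any Cisco tooling.

```python
def parse_counters(text):
    """Return {counter_name: value} from 'sh int <x> count' output."""
    counters = {}
    rows = [line.split() for line in text.splitlines() if line.strip()]
    for header, values in zip(rows, rows[1:]):
        # A header row starts with "Port"; the row after it carries the values.
        if header[0] == "Port" and values[0] != "Port":
            for name, value in zip(header[1:], values[1:]):
                counters[name] = int(value)
    return counters


# Captures copied from the transcript above.
GI7_4 = """\
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Gi7/4 249784 2000 8 40
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Gi7/4 245614 2000 6 0
"""
TE8_3 = """\
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te8/3 10590 18 28 0
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te8/3 246449 2000 10 0
"""
TE8_1 = """\
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te8/1 273441 2032 174 0
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te8/1 2890 0 11 0
"""


def path_ok(checks, expected):
    """checks: list of (label, counters, counter_name). True per label if count >= expected."""
    return {label: counters[name] >= expected for label, counters, name in checks}


gi74, te83, te81 = parse_counters(GI7_4), parse_counters(TE8_3), parse_counters(TE8_1)
checks = [
    ("request in  Gi7/4", gi74, "InUcastPkts"),   # 7.0.1.1 -> 7.0.1.2 enters DUT
    ("request out Te8/3", te83, "OutUcastPkts"),  # ... and leaves DUT
    ("reply   in  Te8/1", te81, "InUcastPkts"),   # 7.0.1.2 -> 7.0.1.1 enters DUT
    ("reply   out Gi7/4", gi74, "OutUcastPkts"),  # ... and leaves DUT
]
print(path_ok(checks, expected=2000))
```

With the captured values, all four checkpoints show at least the 2000 test packets, which matches the path found during path verification.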
Detailed L2 Packet Flow Troubleshooting
Verify L2 Counters: L2 Forwarding Engine Vlan Count
DUT#sh vlan id 700 counters
* Multicast counters include broadcast packets
Vlan Id : 700
L2 Unicast Packets : 4000
L2 Unicast Octets : 472000
L3 Input Unicast Packets : 0
L3 Input Unicast Octets : 0
L3 Output Unicast Packets : 0
L3 Output Unicast Octets : 0
L3 Output Multicast Packets : 0
L3 Output Multicast Octets : 0
L3 Input Multicast Packets : 0
L3 Input Multicast Octets : 0
L2 Multicast Packets : 0
L2 Multicast Octets : 0
DUT#sh interface <interface> counter errors
DUT#sh counters interface <interface>
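VLAN is bidirectional, so it counts both directions of the flow (7.0.1.1 <-> 7.0.1.2)
sh interface <interface> counter errors: interface level errors (e.g. OutDiscards …)
sh counters interface <interface>: SNMP-like interface counters
[Diagram: path overview through Mod 7 (4 x 12xGE port ASIC), the fabric and Mod 8 (4 x 10GE port ASIC), with the L2 engines in the path marked as the points being checked.]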
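The VLAN counters can be sanity-checked with a little arithmetic against the test traffic. The sketch below works through the numbers from the capture; the split of the per-frame size into a 14-byte Ethernet header plus a 4-byte FCS is an assumption that happens to fit the numbers, not something the slides state.

```python
# Values from the test described above and from "sh vlan id 700 counters".
PINGS_SENT = 2000            # echo requests; each one triggers an echo reply
IP_PACKET_BYTES = 100        # ping size used in the test
L2_UNICAST_PACKETS = 4000    # counter value from the capture
L2_UNICAST_OCTETS = 472000   # counter value from the capture

# The VLAN counter is bidirectional: requests plus replies.
expected_packets = PINGS_SENT * 2

# Average frame size as seen by the L2 forwarding engine.
bytes_per_frame = L2_UNICAST_OCTETS // L2_UNICAST_PACKETS

# Assumption: the counter includes Ethernet header (14) and FCS (4).
ETH_HEADER_BYTES = 14
FCS_BYTES = 4
assumed_frame = IP_PACKET_BYTES + ETH_HEADER_BYTES + FCS_BYTES

print("expected packets:", expected_packets, "counted:", L2_UNICAST_PACKETS)
print("bytes per frame:", bytes_per_frame, "assumed frame size:", assumed_frame)
```

A packet count well below twice the number of pings would suggest that one direction of the flow is not being L2 switched in this VLAN on the DUT.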
Detailed L2 Packet Flow Troubleshooting
Verify L2 Forwarding Engines Counters
DUT#remote command mod 7 show platform hardware earl statistics
Superman 0 Forwarding statistics:
Forwarded Frames = 0x0000000016E3E0E8 (384033000)
Frames fwd'ed to Tycho = 0x000000000BA62E47 (195440199)
L3 results rcvd = 0x000000000BA62E47 (195440199)
. . .
. . .
Src Mac misses = 0x000000000425D50C (69588236)
Dst Mac misses = 0x0000000005340140 (87294272)
line full encountered during New l = 0x0000000000000000 (0)
. . .
correctable errors in bank 0 = 0x0000000000000000 (0)
uncorrectable errors in bank 0 = 0x0000000000000000 (0)
correctable errors in bank 1 = 0x0000000000000000 (0)
uncorrectable errors in bank 1 = 0x0000000000000000 (0)
DBus Header Checksum errors = 0x0000000000000000 (0)
address of the line full = 0x00000204
address of the last error in Bank0 = 0x00004022
address of the last error in Bank1 = 0x00002040
Superman 1 Forwarding statistics:
Do the counters move? Check all relevant forwarding engines/modules.
Superman 0: L2 Engine 0 on module 7
Forwarded Frames: number of frames forwarded by the L2 Engine
Frames fwd'ed to Tycho: number of frames that required an L3 lookup
L3 results rcvd: number of L3 lookup results received from the L3 Forwarding Engine
Src Mac misses: increases per new learn (source MAC lookup miss)
Dst Mac misses: increases per flooded packet (destination MAC lookup miss)
line full encountered: unable to learn because all hash buckets are full
correctable errors in bank 0/1: correctable ECC errors upon reading an entry in the L2 table
uncorrectable errors in bank 0/1: uncorrectable ECC errors upon reading an entry in the L2 table .. HW
DBus Header Checksum errors: the L2 Engine sees a bad CRC in the DBUS header
Superman 1: L2 Engine 1 on module 7
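To answer "do the counters move?", two snapshots of the statistics can be parsed and compared. The following is a minimal sketch with hypothetical helper names; it handles one engine's block at a time (counter names repeat for Superman 0 and Superman 1), and the second snapshot is synthesized with a made-up delta purely to demonstrate the comparison.

```python
import re

# "<name> = 0x<hex> (<decimal>)" as in the capture above.
LINE = re.compile(r"^\s*(.+?)\s*=\s*0x([0-9A-Fa-f]+)\s*\((\d+)\)\s*$")

# Lines copied from the Superman 0 block of the capture.
SNAPSHOT = """\
Forwarded Frames = 0x0000000016E3E0E8 (384033000)
Frames fwd'ed to Tycho = 0x000000000BA62E47 (195440199)
L3 results rcvd = 0x000000000BA62E47 (195440199)
Src Mac misses = 0x000000000425D50C (69588236)
Dst Mac misses = 0x0000000005340140 (87294272)
DBus Header Checksum errors = 0x0000000000000000 (0)
"""


def parse_earl_stats(text):
    """Return {counter: value} for one engine; the hex and decimal fields must agree."""
    stats = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # e.g. address lines that carry no decimal value
        name = match.group(1)
        hex_value = int(match.group(2), 16)
        dec_value = int(match.group(3))
        if hex_value != dec_value:
            raise ValueError(f"hex/decimal mismatch for {name!r}")
        stats[name] = dec_value
    return stats


def moved(before, after):
    """Counters whose value changed between two snapshots, with their delta."""
    return {name: after[name] - before[name]
            for name in before
            if name in after and after[name] != before[name]}


before = parse_earl_stats(SNAPSHOT)
# Hypothetical second snapshot: pretend 10 more frames were flooded.
after = dict(before)
after["Dst Mac misses"] += 10
print(moved(before, after))
```

Steadily increasing "Dst Mac misses" during the problem window would point towards flooding, while incrementing error counters would point towards the forwarding engine itself.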
Detailed L2 Packet Flow Troubleshooting
Verify L2 Counters: Switching Fabric Utilization
DUT#sh fabric status 7
slot channel speed module fabric hotStandby Standby Standby
status status support module fabric
7 0 20G OK OK Y(not-hot)
7 1 20G OK OK Y(not-hot)
DUT#sh fabric status 8
slot channel speed module fabric hotStandby Standby Standby
status status support module fabric
8 0 20G OK OK Y(not-hot)
8 1 20G OK OK Y(not-hot)
DUT#sh fabric utilization detail
Fabric utilization: Ingress Egress
Module Chanl Speed rate peak rate peak
1 0 20G 0% 14% @18:34 17Dec07 0% 13% @14:42 03Jan08
4 0 8G 0% 86% @23:20 17Dec07 0% 100% @10:58 21Dec07
5 0 20G 0% 7% @00:43 18Dec07 0% 27% @10:42 21Dec07
6 0 8G 0% 9% @15:23 17Dec07 0% 16% @16:58 17Dec07
7 0 20G 0% 1% @04:54 22Feb08 0% 1% @02:34 22Feb08
7 1 20G 0% 1% @15:47 21Feb08 0% 6% @18:35 20Mar08
8 0 20G 0% 5% @13:12 21Mar08 0% 6% @16:58 17Dec07
8 1 20G 0% 43% @15:11 26Dec07 0% 29% @13:44 21Dec07
Check that the status of the fabric channels is OK
Check utilization (current and last peak value) for the relevant fabric channels: did any peak coincide with the moment of the drops?
[Diagram: path overview through Mod 7 (4 x 12xGE port ASIC), the fabric and Mod 8 (4 x 10GE port ASIC), with the fabric channels in the path marked: module 7 channel 1 (Gig7/4), module 8 channel 0 (Ten8/3) and module 8 channel 1 (Ten8/1).]
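Matching peaks against the time of the drops is easier once the utilization table is in structured form. This is a minimal sketch assuming the row layout of the capture above; the function names and the 80% threshold are illustrative choices, not values from the presentation.

```python
import re

# module channel speed  in-rate in-peak @time date  out-rate out-peak @time date
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\d+)%\s+(\d+)%\s+@(\S+\s+\S+)"
    r"\s+(\d+)%\s+(\d+)%\s+@(\S+\s+\S+)\s*$"
)

# Rows copied from the "sh fabric utilization detail" capture.
SAMPLE = """\
1 0 20G 0% 14% @18:34 17Dec07 0% 13% @14:42 03Jan08
4 0 8G 0% 86% @23:20 17Dec07 0% 100% @10:58 21Dec07
5 0 20G 0% 7% @00:43 18Dec07 0% 27% @10:42 21Dec07
6 0 8G 0% 9% @15:23 17Dec07 0% 16% @16:58 17Dec07
7 0 20G 0% 1% @04:54 22Feb08 0% 1% @02:34 22Feb08
7 1 20G 0% 1% @15:47 21Feb08 0% 6% @18:35 20Mar08
8 0 20G 0% 5% @13:12 21Mar08 0% 6% @16:58 17Dec07
8 1 20G 0% 43% @15:11 26Dec07 0% 29% @13:44 21Dec07
"""


def parse_utilization(text):
    """Return one dict per fabric channel row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "module": int(m.group(1)), "channel": int(m.group(2)),
                "speed": m.group(3),
                "ingress_rate": int(m.group(4)), "ingress_peak": int(m.group(5)),
                "ingress_peak_at": m.group(6),
                "egress_rate": int(m.group(7)), "egress_peak": int(m.group(8)),
                "egress_peak_at": m.group(9),
            })
    return rows


def peaks_at_or_above(rows, threshold):
    """List (module, channel, direction, peak %, timestamp) for peaks >= threshold."""
    hits = []
    for row in rows:
        for direction in ("ingress", "egress"):
            peak = row[f"{direction}_peak"]
            if peak >= threshold:
                hits.append((row["module"], row["channel"], direction,
                             peak, row[f"{direction}_peak_at"]))
    return hits


rows = parse_utilization(SAMPLE)
for hit in peaks_at_or_above(rows, threshold=80):  # threshold is an arbitrary example
    print(hit)
```

The timestamps in the result can then be compared with the time the drops were reported; for the flow in this example, only the channels found via the FPOE mapping are relevant.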
Detailed L2 Packet Flow Troubleshooting
Verify L2 Counters: Relevant Fabric Channels
rxErrors: the line card fabric ASIC reports bad packets; is the card inserted properly? A few incrementing 'rxErrors' that are not correlated to any network events are OK and acceptable
DUT#sh fabric channel-counters 7
slot channel rxErrors txErrors txDrops lbusDrops
7 0 0 0 0 0
7 1 0 0 0 0
DUT#sh fabric errors 7
Module errors:
slot channel crc hbeat sync DDR sync
7 0 0 0 0 0
7 1 0 0 0 0
Fabric errors:
slot channel sync buffer timeout
7 0 0 0 0
7 1 0 0 0
DUT#sh fabric channel-counters 8 …
DUT#sh fabric errors 8 …
Unable to send packets from fabric to line card: check traffic levels; is the line card OK?
Fabric interface unable to send packets from local bus to fabric (Supervisor and 65XX modules only, not 67XX; 67XX will report Overruns in “show interface”): check traffic levels; congestion?
Fabric serial link bit errors (8 serial links in each fabric channel), reported as soon as there are 2 fabric serial link interrupts within 100 ms; can result in rxErrors/txErrors; check that the card is inserted OK
Fabric ASIC unable to send traffic to the fabric-enabled module for the last 3+ seconds
Detailed L2 Packet Flow Troubleshooting
Verify L2 Tables
Primary entry: indicates that the ingress module for this MAC address is module 8
There are 2 Layer 2 Engines on module 8
DUT#show mac-address-table address 0011.bced.e400 vlan 700 all
Legend: * - primary entry
age - seconds since last seen
n/a - not available
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
Module 1:
700 0011.bced.e400 dynamic Yes 35 Po1
Active Supervisor:
700 0011.bced.e400 dynamic Yes 40 Po1
Standby Supervisor:
700 0011.bced.e400 dynamic Yes 40 Po1
Module 7[FE 1]:
700 0011.bced.e400 dynamic Yes 95 Po1
Module 7[FE 2]:
700 0011.bced.e400 dynamic Yes 95 Po1
Module 8[FE 1]:
* 700 0011.bced.e400 dynamic Yes 30 Po1
Module 8[FE 2]:
* 700 0011.bced.e400 dynamic Yes 30 Po1
Check that the MAC tables of all L2 Forwarding Engines are in sync (if not, traffic is possibly being flooded) and correct
[Diagram: path overview through Mod 7 (4 x 12xGE port ASIC), the fabric and Mod 8 (4 x 10GE port ASIC), with the L2 engines in the path marked as the points being checked.]
Detailed L2 Packet Flow Troubleshooting
Verify L2 Tables
DUT#show mac-address-table learning vlan 700
VLAN Mod1 Mod5 Mod6 Mod7 Mod8
---- ------------------------------
700 yes yes yes yes yes
DUT#show mac-address-table synchronize statistics
MAC Entry Out-of-band Synchronization Feature Statistics:
---------------------------------------------------------
Module [7]
-----------
Module Status:
Statistics collected from module : 7
Number of L2 asics in this module : 2
Global Status:
Status of feature enabled on the switch : on
Default activity time : 160
Configured current activity time : 160
Statistics from ASIC 0 when last activity timer expired:
. . .
Number of active entries read : 41295
Number of entries ignored with update to age byte : 16251
Number of entries updated with age byte : 20217
Number of entries created new : 227
Statistics from ASIC 1 when last activity timer expired: …
show mac-address-table learning: is learning on for the VLAN in all L2 Engines? If not, possible flooding
Status of feature enabled on the switch: off by default, except on WS-X6708, where it is on by default
Activity time: the default value is 160 seconds; the normal aging timer should be at least 3x the activity interval, so with the default of 160 seconds, change the aging timer to 480 seconds or more
The per-ASIC statistics show the number of entries that were synced by the SW sync feature
If there is flooding and the MAC address tables are not in sync across DFCs/PFCs: check whether the extra EOBC L2 table SW sync feature (which complements HW L2 synchronization) is on; if not, try turning it on with “mac-address-table synchronize” (sup720 only)
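The timer rule above (aging timer at least three times the synchronization activity interval) is easy to encode as a check. A minimal sketch with hypothetical function names; the 300-second value in the usage line is just an example input.

```python
def min_aging_time(activity_time, factor=3):
    """Smallest aging timer (seconds) that satisfies 'aging >= factor x activity time'."""
    return activity_time * factor


def aging_ok(aging_time, activity_time, factor=3):
    """True if the configured aging timer respects the rule of thumb from the slide."""
    return aging_time >= min_aging_time(activity_time, factor)


# With the default activity time of 160 seconds from the capture above:
print(min_aging_time(160))   # 480
print(aging_ok(300, 160))    # example aging value of 300 s is too short
print(aging_ok(480, 160))    # 480 s satisfies the rule
```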
L3 Unicast Traffic Network Refresh
DUT is the Device Under Test we are troubleshooting
DUT is a 6509 with Supervisor 720
R1/R2 are neighboring devices
Connections are, respectively, a 5 x 1 Gigabit Ethernet port channel and 2 x Ten Gigabit links
Running equal cost multi path routing with, respectively, 5 (Vlans 701 - 705) and 2 (L3 Ten8/1 and Ten8/3) equal cost paths
[Diagram: 8.0.1.1 - R1 - DUT - R2 - 9.0.1.2. R1 Po1 connects to DUT Po2 over the Gigabit links; DUT connects to R2 over Ten8/1 and Ten8/3.]
Detailed L3 Packet Flow Troubleshooting
“In the Switch Path”, L3 Counters and Tables
Similar to L2: check port counters, relevant fabric channels, L2 Engine counters and tables
Additionally: check L3 Engine counters and tables
[Diagram: Module 7 (WS-X6748, DFC3) and Module 8 (WS-X6704, DFC3) connected through the Switch Fabric and the EOBC, annotated with port counters (Gig7/3, Gig7/6, Ten8/3), fabric counters (channels 0 and 1), L2 Engine counters and tables, and L3 Engine counters and tables.]
Detailed L3 Packet Flow Troubleshooting
L3 Engine in Detail: Counters and Tables
L3 forwarding tables get programmed by SW: the HW holds a copy of the SW forwarding tables
EOBC is used for communication between modules and RP, and to program the L3 tables
L2 CAM (64K) contains MAC entries
FIB TCAM contains IPv4/IPv6 prefixes and MPLS entries
ADJ contains rewrite info
NetFlow table is used for stats and features
QoS TCAM contains QoS ACL entries
ACL TCAM contains security and feature ACL entries
ACE Counter: hardware for ACL TCAM counters
[Diagram: PFC3/DFC3 with L2 Engine (L2 CAM) and L3/4 Engine (FIB TCAM, Adj TCAM, NetFlow, QoS TCAM, ACL TCAM, ACE Counter), plus the path overview through Mod 7 (4 x 12xGE port ASIC), the fabric and Mod 8 (4 x 10GE port ASIC) for Gig7/3, Gig7/6 and Ten8/3.]
Detailed L3 Packet Flow Troubleshooting
Verify L2 Tables for Router mac Address
DUT#sh int vlan 702 | i address
Hardware is EtherSVI, address is 0050.f0f8.7400 (bia 0050.f0f8.7400)
Internet address is 7.2.1.2/24
DUT#show mac-address-table address 0050.f0f8.7400 vlan 702 all
Legend: * - primary entry
age - seconds since last seen
n/a - not available
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
Module 1:
* 702 0050.f0f8.7400 static No - Router
Active Supervisor:
* 702 0050.f0f8.7400 static No - Router
Standby Supervisor:
* 702 0050.f0f8.7400 static No - Router
Module 7[FE 1]:
* 702 0050.f0f8.7400 static No - Router
Module 7[FE 2]:
* 702 0050.f0f8.7400 static No - Router
Module 8[FE 1]:
* 702 0050.f0f8.7400 static No - Router
Module 8[FE 2]:
* 702 0050.f0f8.7400 static No - Router
Tagged as router MAC, so packets destined to that MAC address are L3 switched in HW, based on the PFC3/DFC3 HW content
Check that the MAC tables of all L2 forwarding engines have the routed interface MAC programmed as a router MAC; if not, possibly no HW switching
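The per-engine check above can be scripted against a saved capture. The following is a minimal sketch, assuming the output has been saved as text with one header line per forwarding engine, as in the capture above; the function name and the sample (in which Module 7[FE 2] deliberately lacks the entry) are hypothetical.

```python
# Hypothetical sample: same layout as the capture above, but the last
# forwarding engine is missing the router MAC entry on purpose.
SAMPLE = """\
Module 1:
* 702 0050.f0f8.7400 static No - Router
Active Supervisor:
* 702 0050.f0f8.7400 static No - Router
Module 7[FE 1]:
* 702 0050.f0f8.7400 static No - Router
Module 7[FE 2]:
"""


def router_mac_status(output: str, mac: str) -> dict:
    """Map each forwarding-engine header to True if `mac` is a static Router entry."""
    status = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.endswith(":") and not line.startswith("*"):
            # Header line such as "Module 7[FE 1]:" starts a new engine section.
            current = line[:-1]
            status[current] = False
        elif current and mac in line:
            fields = line.split()
            status[current] = "static" in fields and fields[-1] == "Router"
    return status


status = router_mac_status(SAMPLE, "0050.f0f8.7400")
missing = [engine for engine, ok in status.items() if not ok]
print(missing)
```

Any engine listed in `missing` is a candidate for the "possibly no HW switching" case described above.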
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table Programming Flow
Verify Layer 3:
IOS Routing Table (RP): show ip route <ip address>
IOS FIB Table (RP): show ip cef <ip address>
IOS FIB Table (SP/DFC): remote command module <mod> show ip cef <ip address> …
MLS FIB Table (SP/DFC): show mls cef lookup <ip address> <mod>
Verify Layer 2 rewrite:
IOS ARP Cache Table (RP): show ip arp <next hop ip address>
IOS Adjacency Table (RP): show ip cef adjacency <interface> <next hop ip address>
IOS Adjacency Table (SP/DFC): remote command module <mod> show adjacency <interface> <next hop ip address> detail
MLS Adjacency Table (SP/DFC): show mls cef adjacency entry <index> module <mod>
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table and Counters
DUT#sh ip cef 9.0.1.2 <- SW FIB
9.0.0.0/8
nexthop 7.6.1.2 TenGigabitEthernet8/1
nexthop 7.7.1.2 TenGigabitEthernet8/3
DUT#sh ip cef exact-route 8.0.1.1 9.0.1.2
8.0.1.1 -> 9.0.1.2 => IP adj out of TenGigabitEthernet8/1, addr 7.6.1.2
DUT#sh ip cef adjacency TenGigabitEthernet 8/1 7.6.1.2
7.6.1.2/32
attached to TenGigabitEthernet8/1
9.0.0.0/8
nexthop 7.6.1.2 TenGigabitEthernet8/1
DUT#sh mls cef lookup 9.0.1.2 mod 7
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
108749 9.0.0.0/8 Te8/1 , 000f.f8e4.d000 (Hash: 007F)
Te8/3 , 000f.f8e4.d000 (Hash: 7F80)
DUT#sh mls cef exact-route 8.0.1.1 0 9.0.1.2 0 module 7
Interface: Te8/3, Next Hop: 7.7.1.2, Vlan: 1090, Destination Mac: 000f.f8e4.d000
DUT#show vlan internal usage | i 1090
1090 TenGigabitEthernet8/3
Check HW FIB table on ingress DFC/PFC (module 7 in this case): finds the longest prefix match in HW for … is it consistent with the SW information ?
Which one of the 2 is being used ?
Exact path for SW switched packets
SW adjacency
An L3 interface maps internally to a “1-port” VLAN
Displays exact HW load sharing path for the flow … if not UDP or TCP, use port numbers 0, else use correct port numbers !
Which adjacency is used ?
SW
HW
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table and Counters
DUT#show adjacency ten 8/3 7.7.1.2 detail
Protocol Interface Address
IP TenGigabitEthernet8/3 7.7.1.2(17)
2001 packets, 228114 bytes
epoch 0
sourced in sev-epoch 774
Encap length 14
000FF8E4D0000050F0F874000800
ARP
DUT#show mls cef lookup 9.0.1.2 detail mod 7
Codes: M - mask entry, V - value entry, A - adjacency index, P - priority bit
D - full don't switch, m - load balancing modnumber, B - BGP Bucket sel
V0 - Vlan 0,C0 - don't comp bit 0,V1 - Vlan 1,C1 - don't comp bit 1
RVTEN - RPF Vlan table enable, RVTSEL - RPF Vlan table select
Format: IPV4_DA - (8 | xtag vpn pi cr recirc tos prefix)
Format: IPV4_SA - (9 | xtag vpn pi cr recirc prefix)
M(108749 ): E | 1 FFF 0 0 0 0 255.0.0.0
V(108749 ): 8 | 1 0 0 0 0 0 9.0.0.0 (A:294933 ,P:1,D:0,m:14,B:0 )
Aggregate HW adjacency statistics (SW collects them from all DFCs/PFCs for all prefixes linked to this adjacency): do they increment ?
Rewrite information (Dmac|Smac|0800): verify that it conforms to the next hop rewrite info
To get the HW adjacency statistics for this prefix on this module
Start adjacency pointer is 294933; m is 14, so 14 + 1 = 15 adjacencies are linked to the prefix
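The two callouts above involve a little arithmetic: splitting the 14-byte encap string into Dmac|Smac|ethertype, and turning the start pointer (A) and load-balancing value (m) into an adjacency index range. A minimal sketch, assuming only the layout stated on the slide; the helper names are hypothetical.

```python
def decode_rewrite(hex_string: str) -> dict:
    """Split a 14-byte Ethernet rewrite string into Dmac | Smac | ethertype."""
    raw = bytes.fromhex(hex_string)
    if len(raw) != 14:
        raise ValueError("expected a 14-byte encap string")

    def dotted(mac_bytes: bytes) -> str:
        # Format as Cisco-style xxxx.xxxx.xxxx
        text = mac_bytes.hex()
        return ".".join(text[i:i + 4] for i in (0, 4, 8))

    return {
        "dmac": dotted(raw[0:6]),
        "smac": dotted(raw[6:12]),
        "ethertype": raw[12:14].hex(),
    }


def adjacency_range(start: int, m: int) -> range:
    """Adjacency indexes linked to a prefix: m + 1 entries starting at `start`."""
    return range(start, start + m + 1)


rewrite = decode_rewrite("000FF8E4D0000050F0F874000800")
indexes = adjacency_range(294933, 14)
print(rewrite)
print(indexes[0], indexes[-1], len(indexes))
```

The decoded Dmac can then be compared with the ARP entry for the next hop, and the Smac with the router MAC of the egress interface.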
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table and Counters
DUT#show mls cef adjacency entry 294933 to 294947 mod 7
Index: 294933 smac: 0050.f0f8.7400, dmac: 000f.f8e4.d000
mtu: 9234, vlan: 1091, dindex: 0x0, l3rw_vld: 1
packets: 0, bytes: 0
. . .
Index: 294947 smac: 0050.f0f8.7400, dmac: 000f.f8e4.d000
mtu: 9234, vlan: 1090, dindex: 0x0, l3rw_vld: 1
packets: 0, bytes: 0
DUT#show mls cef adjacency entry 294933 to 294947 mod 7 | i packets
packets: 0, bytes: 0
packets: 0, bytes: 0
packets: 0, bytes: 0
packets: 0, bytes: 0
packets: 0, bytes: 0
packets: 0, bytes: 0
packets: 0, bytes: 0
packets: 2001, bytes: 236118
. . .
DUT#show mls cef adjacency entry 294940 det mod 7
Index: 294940 smac: 0050.f0f8.7400, dmac: 000f.f8e4.d000
…
packets: 0, bytes: 0
For the other direction (to 8.0.1.1), use the same commands
15 HW adjacencies are linked to this prefix: which one is really used ?
The 8th one … because SW polls these clear-on-read counters, the hit is hard to capture … check whether the adjacency counters move
The 8th one is reset to 0: SW polls these clear-on-read counters for “show adjacency”
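Finding "which one is really used" amounts to locating the non-zero counter in the filtered output and adding its position to the start index. A minimal sketch, assuming the `| i packets` output lists the entries in index order starting at the first index of the range; the function name is hypothetical and the sample repeats the lines shown above.

```python
import re

# The eight counter lines shown in the filtered output above.
SAMPLE = "\n".join(["packets: 0, bytes: 0"] * 7 + ["packets: 2001, bytes: 236118"])


def active_adjacencies(first_index: int, output: str) -> list:
    """Return (index, packets, bytes) for every adjacency with a non-zero packet count."""
    hits = []
    counters = re.findall(r"packets:\s*(\d+),\s*bytes:\s*(\d+)", output)
    for offset, (packets, byte_count) in enumerate(counters):
        if int(packets) > 0:
            hits.append((first_index + offset, int(packets), int(byte_count)))
    return hits


hits = active_adjacencies(294933, SAMPLE)
print(hits)
```

Because the counters are clear-on-read, a single empty result is not conclusive; several samples taken while traffic is flowing give a better picture.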
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table Special Entries/Adjacencies
Default entry: 0.0.0.0/0 (“match all”), ALWAYS at the bottom of the FIB TCAM; if there is no default route, it points to the drop adjacency,
DUT#sh ip route 0.0.0.0 0.0.0.0
% Network not in table
DUT#sh mls cef lookup 123.0.1.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
134368 0.0.0.0/0 drop
DUT#
else the default route is linked to a HW adjacency
DUT#sh mls cef lookup 123.0.1.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
134368 0.0.0.0/0 Vl1200 , 0011.bc75.9c00
DUT#
No default route present
The match-all entry links to the drop adjacency, which is subject to rate limiter "ICMP UNREAC. NO-ROUTE": in-profile packets get punted to the CPU … so a possible reason for packets hitting the CPU
After adding a default route via Vlan1200, the adjacency points to the next hop, and everything is switched in HW
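The behaviour of the match-all entry follows from longest prefix match: 0.0.0.0/0 always matches, so it only wins when nothing more specific does. A minimal software sketch of that lookup (the real FIB TCAM does this in hardware; the two small tables and the function name are hypothetical, with adjacency names borrowed from the captures above):

```python
import ipaddress


def lookup(fib: dict, destination: str) -> tuple:
    """Return (prefix, adjacency) of the longest prefix that contains `destination`."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, adjacency in fib.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, adjacency)
    if best is None:
        raise LookupError("no matching entry; a real FIB always has 0.0.0.0/0")
    return str(best[0]), best[1]


# Without a default route the match-all entry points to the drop adjacency.
fib_without_default = {"0.0.0.0/0": "drop", "9.0.0.0/8": "Te8/1"}
# With a default route via Vlan1200 the same entry points to a real next hop.
fib_with_default = {"0.0.0.0/0": "Vl1200", "9.0.0.0/8": "Te8/1"}

print(lookup(fib_without_default, "123.0.1.1"))
print(lookup(fib_with_default, "123.0.1.1"))
print(lookup(fib_with_default, "9.0.1.2"))
```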
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table Special Entries/Adjacencies
Drop adjacency (route to Null0): subject to rate limiter "ICMP UNREAC. NO-ROUTE"
FIB receive (local IP address): subject to rate limiter “CEF RECEIVE”
DUT#sh mls cef lookup 7.1.1.2
Index Prefix Adjacency
343 7.1.1.2/32 receive
CEF Glean: subject to rate limiter “CEF GLEAN”
DUT#sh ip route 5.0.1.0
Routing entry for 5.0.1.0/24
…
* directly connected, via Vlan1000
Route metric is 0, traffic share count is 1
DUT#sh ip arp 5.0.1.123
DUT#show mls cef lookup 5.0.1.123
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
3212 5.0.1.0/24 glean
If the receive entry is not present, packets for local IP addresses do not reach the RP (SW)
If the glean entry is not present, packets to unresolved IP addresses of directly connected hosts/routers will not get punted to the RP (SW) to trigger ARP resolution
Unresolved directly connected host
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table Special Entries/Adjacencies
CEF Glean: subject to rate limiter “CEF GLEAN” (continued)
DUT#sh ip route 77.0.0.0
Routing entry for 77.0.0.0/8
Known via "static", distance 1, metric 0 (connected)
Redistributing via ospf 100
Routing Descriptor Blocks:
* directly connected, via TenGigabitEthernet8/3
Route metric is 0, traffic share count is
DUT#sh mls cef lookup 77.0.0.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
108750 77.0.0.0/8 glean
DUT#sh ip arp 77.0.0.1
DUT#sh ip arp 77.0.0.1
Protocol Address Age (min) Hardware Addr Type Interface
Internet 77.0.0.1 0 000f.f8e4.d000 ARPA TenGigabitEthernet8/3
DUT#sh mls cef lookup 77.0.0.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
165 77.0.0.1/32 Te8/3 , 000f.f8e4.d000
Another example where we need CEF glean: a static route with the next hop specified as an interface relies on proxy ARP on the next hop to resolve the next hop
Not yet resolved: the first packet hits the glean entry, goes to the RP and triggers ARP resolution; if no glean entry is present, we keep hitting SW
Resolved via ARP
Host entry for the destination, based on proxy ARP resolution. Static routes like this can use up lots of FIB table entries !!
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table HW Rate Limiters
Rate limiters: HW rate limits packets pointed to the Route Processor; there is no counter !!
For L3 Unicast:
DUT#sh mls rate-limit
Sharing Codes: S - static, D - dynamic
Codes dynamic sharing: H - owner (head) of the group, g - guest of the group
Rate Limiter Type Status Packets/s Burst Sharing
--------------------- ---------- --------- ----- -------
....
IP FEATURES Off - - -
CEF RECEIVE Off - - -
CEF GLEAN Off - - -
IP RPF FAILURE On 100 10 Group:0 S
TTL FAILURE Off - - -
ICMP UNREAC. NO-ROUTE On 100 10 Group:0 S
ICMP UNREAC. ACL-DROP On 100 10 Group:0 S
ICMP REDIRECT Off - - -
MTU FAILURE Off - - -
UCAST IP OPTION Off - - -
IP ERRORS On 100 10 Group:0 S
...
Truncated output: only the rate limiters relevant for IP unicast are listed
Shared (same group) indicates that packets matching these types are subject to the same HW rate limiter, at 100 pps aggregate per DFC/PFC !!
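To see at a glance which limiter types compete for the same aggregate rate, the rows can be grouped by their sharing group. A minimal sketch, assuming the rows have been saved as text in the column order shown above (name, status, packets/s, burst, sharing); the regular expression and function name are hypothetical.

```python
import re
from collections import defaultdict

# Rows copied from the truncated output above.
SAMPLE = """\
IP FEATURES Off - - -
CEF RECEIVE Off - - -
CEF GLEAN Off - - -
IP RPF FAILURE On 100 10 Group:0 S
TTL FAILURE Off - - -
ICMP UNREAC. NO-ROUTE On 100 10 Group:0 S
ICMP UNREAC. ACL-DROP On 100 10 Group:0 S
ICMP REDIRECT Off - - -
MTU FAILURE Off - - -
UCAST IP OPTION Off - - -
IP ERRORS On 100 10 Group:0 S
"""

ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<status>On|Off)\s+(?P<pps>\S+)\s+(?P<burst>\S+)\s+(?P<sharing>.+)$"
)


def shared_groups(output: str) -> dict:
    """Group enabled rate limiters by sharing group: {group: [(name, pps), ...]}."""
    groups = defaultdict(list)
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if not match or match["status"] != "On":
            continue
        group = re.search(r"Group:(\d+)", match["sharing"])
        if group:
            groups[int(group.group(1))].append((match["name"], int(match["pps"])))
    return dict(groups)


groups = shared_groups(SAMPLE)
for group_id, members in groups.items():
    print(group_id, [name for name, _ in members])
```

In this capture all four enabled limiters sit in group 0, so RPF failures, no-route, ACL-drop and IP error packets share one 100 pps budget per DFC/PFC.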
Detailed L3 Packet Flow Troubleshooting
L3 Engine Counters
DUT#sh mls statistics mod 8
Statistics for Earl in Module 8
L2 Forwarding Engine
Total packets Switched : 1453743937845
L3 Forwarding Engine
Total packets L3 Switched : 1251810539335 @ 0 pps
Total Packets Bridged : 200667165283
Total Packets FIB Switched : 1251810539334
Total Packets ACL Routed : 0
Total Packets Netflow Switched : 1
…
Total packets dropped by ACL : 2
Total packets dropped by Policing : 0
Total packets exceeding CIR : 0
Total packets exceeding PIR : 0
Errors
MAC/IP length inconsistencies : 0
Short IP packets received : 0
IP header checksum errors : 0
TTL failures : 7852668
MTU failures : 200209207135
Check for all modules that have DFC/PFC; lookup is at ingress DFC/PFC
Refer to earlier L2 counters
Total packets and current Packet-Per-Second seen
L2 Switched packets
Forwarded based on FIB TCAM table result, ACL TCAM table result or Netflow TCAM table result
Security ACL drops
QOS ACL drops
Errors pointed to route processor, subject to HW rate limiters
Detailed L3 Packet Flow Troubleshooting
L3 FIB Table uRPF and VRF
Checking uRPF:
DUT#sh mls cef rpf 9.0.1.1
RPF information for prefix 9.0.1.1
uRPF check performed in the hardware for interfaces:
TenGigabitEthernet8/1
TenGigabitEthernet8/3
uRPF check disabled for interfaces:
Checking VRFs:
DUT#remote com sw sh mls vlan-ram 701 end 701
TYCHO Vlan RAM
Key: * => Set, - => Clear
vlan eom nf-vpn mpls mc-base siteid stats rpf vpn-num bgp-grp l2-metro rpf-pbr-ovr
----+---+------+----+-------+------+-----+---+-------+-------+--------+-----------
701 - - * 0 0 - - 0 0 - *
DUT(config)#int vlan 701
DUT(config-if)#ip vrf forwarding customer-1
DUT#remote com sw sh mls vlan-ram 701 end 701
TYCHO Vlan RAM
Key: * => Set, - => Clear
vlan eom nf-vpn mpls mc-base siteid stats rpf vpn-num bgp-grp l2-metro rpf-pbr-ovr
----+---+------+----+-------+------+-----+---+-------+-------+--------+-----------
701 - - * 0 0 - - 256 0 - *
DUT#show mls cef exact-route vrf ?
WORD VPN Routing/Forwarding instance name
Verify that the unicast RPF check is performed in HW
Check that the interface (vlan 701) is in the correct VRF (VPN value 0: default routing table)
Illustration: move the interface to a different VRF and check that this got programmed in HW … issues are sometimes seen with the interface staying in the default VRF; check this on each DFC/PFC !!
From here on, use the same commands as with the default FIB, specifying the VPN with the vrf keyword
Detailed L3 Packet Flow Troubleshooting
What Have We (Not Yet) Looked at ?
Already verified: FIB and adjacency tables, as well as the L2 CAM table
Still to look at: ACL and Netflow Table
[Diagram: PFC3/DFC3 block diagram as on the earlier slide, with check marks on the tables already verified and question marks on the remaining ones]
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table and Counters
After configuring a VLAN access map on vlan 701 to 705
DUT#sh vlan access-map DenyHost1<->Host2
Vlan access-map "DenyHost1<->Host2" 10
match: ip address DenyHost1<->Host2
action: drop
Vlan access-map "DenyHost1<->Host2" 20
match: ip address MatchAll
action: forward
DUT#sh tcam int vlan 705 acl in ip mod 7
* Global Defaults not shared
Entries from Bank 0
deny ip host 8.0.1.1 host 9.0.1.2 (87 matches)
permit ip any any (1167 matches)
Entries from Bank 1
Configuration of VLAN access map
Verify that the (correct) ACL is present in HW; remember that flow 8.0.1.1 -> 9.0.1.2 came in via Vlan 705, ingress interface Gi7/3, but a VACL is bidirectional, hence a similar out(bound) entry should be present
ACL drop counter: PFC2/PFC3A don’t have this counter; it can get cleared by SW, so look at the trend while debugging
Deny/permit keywords; other possibilities:
- redirect: redirect to a specific interface (can be RP, central rewrite …)
- punt: point to CPU
- policy-route
Useful to find out why packets get punted to SW (RP)
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table and Counters
After adding Policy Based Routing on Vlan 705
DUT#show route-map PolicyRouteTo9.0.1.2
route-map PolicyRouteTo9.0.1.2, permit, sequence 10
Match clauses:
ip address (access-lists): Select9.0.1.2
Set clauses:
ip next-hop 7.1.1.1
Policy routing matches: 0 packets, 0 bytes
DUT#sh tcam int vlan 705 acl in ip mod 7
* Global Defaults not shared
Entries from Bank 0
deny ip host 8.0.1.1 host 9.0.1.2
permit ip any any
Entries from Bank 1
policy-route ip any host 9.0.1.2
permit ip any any (84 matches)
What does this effectively mean … ?
Policy based routing configuration
Inbound VACL entry
(Inbound) PBR entry
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table: ACL TCAM Structure
Higher and lower bank
Lookup in BOTH banks generates 2 results
The result with the highest priority wins; if both results have high priority, the HI bank wins; if both are low, the LO bank wins
Single lookup mode: equivalent to a single bank result
Serial lookup mode: only apply the LO bank lookup if the HI bank result says permit
[Diagram: a 144-bit key (packet information) is looked up in the HI bank TCAM (hi priority) and the LOW bank TCAM (low priority); the banks return RSLT1/priority1 and RSLT2/priority2; axis marks 0, 16K, 32K]
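The bank selection rule above can be written down as a small decision function. This is only a sketch of the rule as stated on the slide, not of the actual hardware logic; the `BankResult` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BankResult:
    action: str          # e.g. "deny", "permit", "redirect"
    high_priority: bool  # True when the result carries the priority flag


def resolve(hi: BankResult, lo: BankResult) -> BankResult:
    """Pick the winning result of a parallel lookup in the HI and LO banks."""
    if hi.high_priority and not lo.high_priority:
        return hi  # only HI has priority set
    if lo.high_priority and not hi.high_priority:
        return lo  # only LO has priority set
    if hi.high_priority:
        return hi  # both high: HI bank wins
    return lo      # both low: LO bank wins


print(resolve(BankResult("deny", True), BankResult("redirect", True)).action)
print(resolve(BankResult("permit", False), BankResult("redirect", True)).action)
print(resolve(BankResult("permit", False), BankResult("deny", False)).action)
```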
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table and Counters
Understanding what is programmed in the ACL
DUT#sh tcam int vlan 705 acl in ip detail mod 7
…
Interface: 705 label: 3585 lookup_type: 0
protocol: IP packet-type: 0
+-+-----+---------------+---------------+---------------+---------------+-------+---+----+-+---+--+---+---+
|T|Index| Dest Ip Addr | Source Ip Addr| DPort | SPort | TCP-F |Pro|MRFM|X|TOS|TN|COD|F-P|
+-+-----+---------------+---------------+---------------+---------------+-------+---+----+-+---+--+---+---+
V 17789 9.0.1.2 8.0.1.1 P=0 P=0 ------ 0 ---- 0 0 -- C-- 0-0
M 17792 255.255.255.255 255.255.255.255 0 0 ------ 0 ---- 0 0
R rslt: L2_L3_DENY_RESULT (*) rtr_rslt: L2_L3_DENY_RESULT (*) hit_cnt=0
V 17840 0.0.0.0 0.0.0.0 P=0 P=0 ------ 0 ---- 0 0 -- C-- 0-0
M 17846 0.0.0.0 0.0.0.0 0 0 ------ 0 ---- 0 0
R rslt: PERMIT_RESULT rtr_rslt: PERMIT_RESULT hit_cnt=0 <- Match all entry Hi Bank
V 18396 0.0.0.0 0.0.0.0 P=0 P=0 ------ 0 ---- 0 0 -- --- 0-0
M 18404 0.0.0.0 0.0.0.0 0 0 ------ 0 ---- 0 0
R rslt: L3_DENY_RESULT rtr_rslt: L3_DENY_RESULT hit_cnt=0 Hi Bank
V 31642 9.0.1.2 0.0.0.0 P=0 P=0 ------ 0 ---- 1 0 -- C-- 0-0
M 31643 255.255.255.255 0.0.0.0 0 0 ------ 0 ---- 1 0
R rslt: REDIRECT_ADJACENCY (*) rtr_rslt: PERMIT_RESULT indx: 0x7F803 hit_cnt=0
V 36293 0.0.0.0 0.0.0.0 P=0 P=0 ------ 0 ---- 0 0 -- C-- 0-0 <-
M 36296 0.0.0.0 0.0.0.0 0 0 ------ 0 ---- 0 0 <-
R rslt: PERMIT_RESULT rtr_rslt: PERMIT_RESULT hit_cnt=95 <- <- Match all entry Lo Bank
V 36828 0.0.0.0 0.0.0.0 P=0 P=0 ------ 0 ---- 0 0 -- --- 0-0
M 36836 0.0.0.0 0.0.0.0 0 0 ------ 0 ---- 0 0
R rslt: L3_DENY_RESULT (*) rtr_rslt: L3_DENY_RESULT (*) hit_cnt=17 Lo Bank
Look at the ingress module
Flow 8.0.1.1 -> 9.0.1.2, first-match lookup result in the Hi bank: index 17789, priority set (to high) as indicated by (*)
Flow 8.0.1.1 -> 9.0.1.2, first-match lookup result in the Lo bank: index 31642, priority set (to high) as indicated by (*)
Apply the rule from the previous slide: if the first matches in the Lo and Hi banks both have priority set (to high), the Hi bank result wins => L2_L3_DENY_RESULT (deny)
Similarly, applying the rules to flow 8.0.1.2 -> 9.0.1.2, entry 31642 wins: REDIRECT_ADJACENCY with index 0x7F803 (policy routing)
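The "first match" in each bank comes from comparing the packet against each Value (V) row under its Mask (M) row. A minimal sketch of that comparison, assuming only the destination and source IP fields matter (ports, protocol and the other columns are ignored here) and listing only the first two V/M pairs of each bank from the output above; the names are hypothetical.

```python
import ipaddress


def ip(text: str) -> int:
    return int(ipaddress.IPv4Address(text))


# (V index, dest value, dest mask, source value, source mask, result)
HI_BANK = [
    (17789, "9.0.1.2", "255.255.255.255", "8.0.1.1", "255.255.255.255", "L2_L3_DENY_RESULT (*)"),
    (17840, "0.0.0.0", "0.0.0.0", "0.0.0.0", "0.0.0.0", "PERMIT_RESULT"),
]
LO_BANK = [
    (31642, "9.0.1.2", "255.255.255.255", "0.0.0.0", "0.0.0.0", "REDIRECT_ADJACENCY (*)"),
    (36293, "0.0.0.0", "0.0.0.0", "0.0.0.0", "0.0.0.0", "PERMIT_RESULT"),
]


def first_match(bank: list, src: str, dst: str):
    """Return (index, result) of the first entry whose masked fields equal the packet's."""
    for index, dval, dmask, sval, smask, result in bank:
        dst_ok = (ip(dst) & ip(dmask)) == (ip(dval) & ip(dmask))
        src_ok = (ip(src) & ip(smask)) == (ip(sval) & ip(smask))
        if dst_ok and src_ok:
            return index, result
    return None


for src in ("8.0.1.1", "8.0.1.2"):
    print(src, first_match(HI_BANK, src, "9.0.1.2"), first_match(LO_BANK, src, "9.0.1.2"))
```

For 8.0.1.1 both banks return a result marked (*), so the Hi bank deny wins; for 8.0.1.2 only the Lo bank result is marked (*), so the redirect wins, which matches the conclusions above.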
Detailed L3 Packet Flow Troubleshooting
What Have We (Not Yet) Looked at ?
Already verified: FIB and adjacency tables, as well as the L2 CAM table and ACL TCAM/counters
Still to look at: Netflow Table
[Diagram: PFC3/DFC3 block diagram as on the earlier slide, with check marks on the tables already verified and a question mark on the remaining one]
Detailed L3 Packet Flow Troubleshooting
L3 ACL and NetFlow Table: Programming the Table
The Feature Interaction Engine (FIE) process selects a correct strategy to program the Hi/Lo banks when multiple ACL based features are combined on the same interface:
HW based, like PBR and security ACLs
HW assisted, like NAT, SLB, TCP intercept, reflexive ACL …: ACLs select the traffic that needs to be punted to the CPU, and SW installs a Netflow entry in HW to forward subsequent packets of the same flow
SW based features: the ACL is used to punt packets that require SW processing to the CPU
The Feature Manager process transforms all ACLs into Value-Mask-Result entries and calls merge algorithms to combine multiple ACL based features; the outcome is programmed into the ACL TCAM
If there is no successful strategy to combine the features in the FIE (feature conflicts, flow mask conflicts), the FIE moves one of the features to SW and re-attempts to find a strategy for the remaining ones
HW assisted/SW features: the ACL TCAM identifies packets that need to be punted to SW for HW assisted (Netflow based) or SW forwarding
L3 Unicast Traffic Network Refresh
DUT is the Device Under Test we are troubleshooting
DUT is a 6509 with Supervisor 720
R1/R2 are neighboring devices
Connections are, respectively, a 5 x 1 Gigabit Ethernet port channel and 2 Ten Gigabit Ethernet links
Running equal cost multipath routing with, respectively, 5 (Vlan 701 to 705) and 2 (Ten8/1 and Ten8/3) equal cost paths
Doing NAT on DUT between Vlan701-705 and Ten8/1, Ten8/3
[Diagram: R1 - DUT - R2; labels: 8.0.1.2, 9.0.1.1, Po1, Po2, Gig5/2, Gig7/2, Gig7/3, Gig7/4, Gig7/5, Gig7/6, Gig8/1, Gig8/2, Gig8/3, Gig8/4, Ten8/1, Ten8/3, Inside, Outside, 10.0.1.3/9.0.1.1]
Detailed L3 Packet Flow Troubleshooting
L3 ACL and NetFlow Tables: HW Assisted Features
NAT SW configuration status: configure NAT between Vlan701-705 and Ten8/1, Ten8/3
DUT#sh ip nat stat
Total active translations: 0 (0 static, 0 dynamic; 0 extended)
Outside interfaces:
Vlan701, Vlan702, Vlan703, Vlan704, Vlan705
Inside interfaces:
TenGigabitEthernet8/1, TenGigabitEthernet8/3
Hits: 10 Misses: 0
CEF Translated packets: 10, CEF Punted packets: 0
Expired translations: 2
Dynamic mappings:
-- Inside Source
[Id: 4] access-list FromR2 pool TowardsR1 refcount 0
pool TowardsR1: netmask 255.255.255.0
start 10.0.1.1 end 10.0.1.255
type generic, total addresses 255, allocated 0 (0%), misses 0
longest chain in pool: TowardsR1's addr-hash: 0, average len 0,chains 0/256
DUT#sh ip nat translations
Pro Inside global Inside local Outside local Outside global
tcp 10.0.1.3:19343 9.0.1.1:19343 8.0.1.2:23 8.0.1.2:23
--- 10.0.1.3 9.0.1.1 ---
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table and Hardware Assisted Features
HW assisted forwarding: SW based creation of Netflow entries
DUT#sh mls netflow ip sw-installed mod 7
Displaying Netflow entries in EARL in module 7
DstIP SrcIP Prot:SrcPort:DstPort Src i/f :AdjPtr
-----------------------------------------------------------------------------
Pkts Bytes Age LastSeen Attributes
---------------------------------------------------
8.0.1.2 9.0.1.1 tcp :19343 :telnet Te8/3 :0x8000A
0 0 36 18:47:42 L3 - SwInstalled
10.0.1.3 8.0.1.2 tcp :telnet :19343 Vl701 :0x8000B
5 230 36 18:47:46 L3 – SwInstalled
DUT#sh mls netflow ip sw-installed mod 8
Displaying Netflow entries in EARL in module 8
DstIP SrcIP Prot:SrcPort:DstPort Src i/f :AdjPtr
-----------------------------------------------------------------------------
Pkts Bytes Age LastSeen Attributes
---------------------------------------------------
8.0.1.2 9.0.1.1 tcp :19343 :telnet Te8/3 :0x80028
7 322 38 18:47:47 L3 – SwInstalled
10.0.1.3 8.0.1.2 tcp :telnet :19343 Vl701 :0x80029
0 0 38 18:47:43 L3 - SwInstalled
Based on the ACL TCAM, the first packet matching the NAT criteria gets punted to the CPU and is NAT’ed in SW; SW then installs a Netflow entry with the correct NAT rewrite info into the Netflow TCAM; subsequent packets hit this entry and get forwarded in HW. Check the ACL TCAM as on the previous slides (if not OK, no NAT), and check for the presence of the SwInstalled Netflow entry, i.e. HW acceleration of the specific NAT translation … (if not OK, possible reason for high CPU)
Adjacency pointer to the adjacency with the NAT rewrite info
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table and Hardware Assisted Features
Adjacencies linked to SW installed Netflow
DUT#show mls cef adj entry 0x8000B det mod 7
Index: 524298 smac: 0050.f0f8.7400, dmac: 000f.f8e4.d000
mtu: 9234, vlan: 1091, dindex: 0x0, l3rw_vld: 1
format: MAC_IPV4, flags: 0x4008418
ip_sa: 0.0.0.0, ip_da: 9.0.1.1
DUT#show mls cef adj entry 0x80028 det mod 8
Index: 524329 smac: 0050.f0f8.7400, dmac: 0011.bc75.9c00
mtu: 1518, vlan: 701, dindex: 0x0, l3rw_vld: 1
format: MAC_IPV4, flags: 0x2008418
ip_sa: 10.0.1.3, ip_da: 0.0.0.0
“show tcam acl” commands on the in/egress interface explain what traffic gets punted to SW because of NAT (the first packet(s), until SW installs the Netflow entry); follow the same logic as for the ACL TCAM when interpreting the output
Similar HW assisted features: Reflexive ACL, SLB, TCP intercept … look at the SW installed Netflow entries, as well as the ACL TCAM content for the relevant interfaces
Adjacency contains NAT rewrite info; index is only significant per PFC/DFC !!
Rewrite info says modify IP destination Address
Rewrite info says modify IP source Address
Detailed L3 Packet Flow Troubleshooting
L3 ACL Table and Hardware Assisted Features
Checking for feature conflicts (these can cause packets to be punted to the CPU), truncated output
DUT#sh fm fie interface vlan 705
Interface Vl705:
Feature interaction state created: Yes
Flowmask conflict status for protocol IP : FIE_FLOWMASK_STATUS_SUCCESS
Flowmask conflict status for protocol OTHER : FIE_FLOWMASK_STATUS_SUCCESS
Interface Vl705 [Ingress]:
FIE Result for protocol IP : FIE_SUCCESS_NO_CONFLICT
Features Configured : VACL NAT PBR - Protocol : IP
FM Label when FIE was invoked : 355
. . .
Interface Vl705 [Egress]:
FIE Result for protocol IP : FIE_SUCCESS_NO_CONFLICT
Features Configured : NAT VACL - Protocol : IP
FM Label when FIE was invoked : 370
. . .
Check both in/egress L3 interfaces, in both in/egress directions, e.g. Vlans 701 to 705 and Ten8/1 and Ten8/3 in our example
No flow mask conflict, OK !!
E.g. other possibilities:
FLOWMASK_CONFLICT
FLOWMASK_REDUCED
If there are conflicting flow mask requirements, traffic on this interface will be sent to software … Redefine and reapply, or deconfigure, one or more features to avoid the conflict.
No feature conflict, OK !!
No HW acceleration support for multiple flow based features on the same flow, since we do not build an adjacency that performs the combined operation of all the features in one pass.
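When many interfaces have to be checked, the status lines can be scanned for anything other than the two "OK" values seen above. A minimal sketch, assuming the output is saved as text in the layout shown on this slide; the set of accepted values and the function name are assumptions, and the conflict case is simulated by editing the sample.

```python
import re

# Status lines taken from the truncated output above.
SAMPLE = """\
Interface Vl705:
 Flowmask conflict status for protocol IP : FIE_FLOWMASK_STATUS_SUCCESS
 Flowmask conflict status for protocol OTHER : FIE_FLOWMASK_STATUS_SUCCESS
Interface Vl705 [Ingress]:
 FIE Result for protocol IP : FIE_SUCCESS_NO_CONFLICT
Interface Vl705 [Egress]:
 FIE Result for protocol IP : FIE_SUCCESS_NO_CONFLICT
"""

# Assumption: only these two values (seen in the capture) mean "no conflict".
OK_VALUES = {"FIE_FLOWMASK_STATUS_SUCCESS", "FIE_SUCCESS_NO_CONFLICT"}
STATUS = re.compile(r"(Flowmask conflict status|FIE Result) for protocol (\S+)\s*:\s*(\S+)")


def fie_problems(output: str) -> list:
    """Return (section, check, protocol, value) for every status that is not an OK value."""
    problems = []
    section = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Interface ") and line.endswith(":"):
            section = line[:-1]
            continue
        match = STATUS.match(line)
        if match and match.group(3) not in OK_VALUES:
            problems.append((section, match.group(1), match.group(2), match.group(3)))
    return problems


print(fie_problems(SAMPLE))
# Simulated conflict: replace the first flow mask status with a non-OK value.
broken = SAMPLE.replace("FIE_FLOWMASK_STATUS_SUCCESS", "FLOWMASK_CONFLICT", 1)
print(fie_problems(broken))
```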
Detailed Packet Flow Troubleshooting
Other Useful Commands
HW/SW inconsistencies: run consistency checker on demand
show mls cef inconsistency now module <module>
Capture all info at once for a particular L3 Prefix:
show platform tech-support unicast <destination> <mask>
Are we running out of L2/L3 Engine Resources (FIB, ACL, Netflow TCAM full …) ?
show platform hardware capacity forwarding
Troubleshooting Unicast Forwarding
Unicast L2 and L3 traffic: what to check ?
Test topology network diagram
Quick sanity checklist (Layer 2/Layer 3)
Detailed L2 packet flow troubleshooting: which counters and (forwarding) tables to look at
Detailed L3 packet flow troubleshooting: which counters and (forwarding) tables to look at
Some useful troubleshooting tools
Some Useful Troubleshooting Tools
What Packets Are (Not) Hitting the CPU ?
Which interface(s) on RP are seeing the packets ?
Reference: http://www.cisco.com/en/US/products/hw/routers/ps133/products_tech_note09186a00800a70f2.shtml
DUT#show proc cpu | i CPU
CPU utilization for five seconds: 99%/87%; one minute: 5%; five minutes: 4%
87% interrupt handling = packets hit the CPU; otherwise, look at which process uses the CPU
DUT#show interface stats
...
Vlan701
Switching path Pkts In Chars In Pkts Out Chars Out
Processor 1635 124010 1701 128085 <- Process switching
Route cache 14431965 1731835800 29499 3539880 <- SW CEF switching
Distributed cache 508013546 62991421254 215068498 25808219760 <- HW CEF (distributed CEF) switching
Total 522447146 64723381064 215099698 25811887725
...
Look at the switching statistics on each interface … which interface has a high (and still increasing) count for Process and/or SW CEF switching of incoming packets ?
DUT#show interface stats | i (^Giga|^Fast|^Port|^Vlan|Processor|Route cache)
Shorter output (faster to analyze, 3 lines/interface)
DUT#show ibc
Interface information:
Interface IBC0/0(idb 0x4784157C)
Hardware is Mistral IBC (revision 5)
5 minute rx rate 66000 bits/sec, 38 packets/sec
5 minute tx rate 64000 bits/sec, 40 packets/sec
8135 Inband input packet drops
229904 Packets CEF Switched, 28956320 Packets Fast Switched
Potential/Actual paks copied to process level 106180293/108080293(4293067296 dropped,2120 spd drops)
(Truncated output) Overall InBand Channel (IBC) traffic statistics: whatever gets punted to the CPU goes through the IBC; check the rates
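In the five-second figure the first number is total CPU and the second is the share spent at interrupt level, so the process-level share is their difference. A minimal sketch that splits the line from the capture above; the function name is hypothetical.

```python
import re


def cpu_split(line: str) -> dict:
    """Split 'five seconds: T%/I%' into total, interrupt and process-level percentages."""
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if not match:
        raise ValueError("no five-second CPU figure found")
    total, interrupt = int(match.group(1)), int(match.group(2))
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


usage = cpu_split("CPU utilization for five seconds: 99%/87%; one minute: 5%; five minutes: 4%")
print(usage)
```

A high interrupt share points at punted traffic (follow the interface and IBC checks above); a high process share points at a specific IOS process instead.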
Some Useful Troubleshooting Tools
What Packets Are (Not) Hitting the CPU ?
What/why Process (not SW CEF) switched packets are hitting the RP ?
DUT#show ip cef switching statistics
DUT#show ip traffic
Information on Process (not SW CEF) switched packets (if no SW CEF, no HW CEF either)
DUT#show buffers input-interface <interface>
In case of Process (not SW CEF) switching, if the input queue is filling up: what packets are in the queue ?
What/why SW CEF (not Process) switched packets are hitting the RP ?
DUT#show ip cef summary
DUT#show mls cef summary
DUT#show platform hardware capacity forwarding
DUT#show mls cef exception status detail
Are we running out of HW resources (HW FIB full) ? Compare the number of entries between SW (IP) and HW (MLS) CEF. Is the platform HW capacity exceeded ? Any FIB exceptions ?
DUT#show tcam interface <interface> acl in …
DUT#show fm fie interface <interface> …
Refer to earlier slides: are any HW assisted or SW features enabled on the “culprit” interface … the TCAM will be used to point packets to the CPU; any flow mask or feature conflicts on the interface ?
monitor session 1 type local
source cpu <rp|sp>
destination interface …
Extra tool in 12.2(33)SXH and higher: CPU SPAN can be used to quickly see with a sniffer what packets are sent to the RP (or SP) CPU; then check the tables (L2/L3 etc.)
Some Useful Troubleshooting Tools
Does the CPU Inband Driver See the Packet ?
Extra tool: debug netdr (use with caution … check with TAC)
DUT#debug netdr capture ?
acl (11) Capture packets matching an acl
and-filter (3) Apply filters in an and function: all must match
continuous (1) Capture packets continuously: cyclic overwrite
destination-ip-address (10) Capture all packets matching ip dst address
dstindex (7) Capture all packets matching destination index
ethertype (8) Capture all packets matching ethertype
interface (4) Capture packets related to this interface
or-filter (3) Apply filters in an or function: only one must match
rx (2) Capture incoming packets only
source-ip-address (9) Capture all packets matching ip src address
srcindex (6) Capture all packets matching source index
tx (2) Capture outgoing packets only
vlan (5) Capture packets matching this vlan number
<cr>
Be as specific as possible (on the SP: remote login switch, then the same set of commands)
Some Useful Troubleshooting Tools
DUT#sh netdr captured-packets
A total of 289 packets have been captured
The capture buffer wrapped 0 times
Total capture capacity: 4096 packets
------- dump of incoming inband packet -------
interface Vl1000, routine mistral_process_rx_packet_inlin
dbus info: src_vlan 0x3E8(1000), src_indx 0x45(69), len 0x40(64)
bpdu 0, index_dir 0, flood 1, dont_lrn 0, dest_indx 0x43E8(17384)
80000401 03E80400 00450000 40800000 E0000000 00000000 00000008 43E80000
mistral hdr: req_token 0x0(0), src_index 0x45(69), rx_offset 0x76(118)
requeue 0, obl_pkt 0, vlan 0x3E8(1000)
destmac FF.FF.FF.FF.FF.FF, srcmac 00.A0.CC.21.94.C4, protocol 0806
layer 3 data: 00010800 06040001 00A0CC21 94C40500 01660000 00000000
05000102 00000000 00000000 00000000 00000000 000001FE
00000006 00000000 000003E8
..
DUT#undebug netdr
DUT#debug netdr clear-capture
E.g.: an ARP packet that came in on Vlan1000 was seen by the RP inband driver
Make sure to turn the debug off afterwards (undebug netdr)
Make sure to clear the memory used by the captured packets (debug netdr clear-capture)
Does the CPU Inband Driver See the Packet ?
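The "layer 3 data" words in the capture above can be decoded by hand to confirm that the frame really is ARP (protocol 0806). The following is a minimal Python sketch, not a Cisco tool. It assumes the first seven 32-bit words of that field form a standard 28-byte Ethernet/IPv4 ARP payload; the name decode_arp is made up for illustration.

```python
import ipaddress
import struct

# First seven words of the "layer 3 data" field from the capture above.
LAYER3_WORDS = "00010800 06040001 00A0CC21 94C40500 01660000 00000000 05000102"


def decode_arp(hex_words: str) -> dict:
    """Decode a 28-byte Ethernet/IPv4 ARP payload given as hex words."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) < 28:
        raise ValueError("need at least 28 bytes for an Ethernet/IPv4 ARP payload")
    htype, ptype, hlen, plen, oper = struct.unpack("!HHBBH", raw[:8])
    if (hlen, plen) != (6, 4):
        raise ValueError("not an Ethernet/IPv4 ARP payload")
    return {
        "hardware_type": htype,
        "protocol_type": ptype,
        "operation": {1: "request", 2: "reply"}.get(oper, str(oper)),
        "sender_mac": raw[8:14].hex(":"),
        "sender_ip": str(ipaddress.IPv4Address(raw[14:18])),
        "target_mac": raw[18:24].hex(":"),
        "target_ip": str(ipaddress.IPv4Address(raw[24:28])),
    }


arp = decode_arp(LAYER3_WORDS)
for field, value in arp.items():
    print(f"{field}: {value}")
```

For this capture the sketch reports an ARP request from 00:a0:cc:21:94:c4 (5.0.1.102) asking for 5.0.1.2; the sender MAC matches the srcmac printed with the mistral header.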
Troubleshooting Unicast Forwarding
Problems We’ve Looked at
(Some) packets don’t get through (drops, incorrect forwarding): checked platform-specific counters and tables
Unwanted flooding: checked that MACs are learned and that the L2 tables are in sync
High CPU due to SW path forwarding: found out (quickly) what packets hit the CPU
Troubleshoot step by step; do not skip steps!
Agenda
Sup720 Architecture (A Quick Look)
Layer 2 and Layer 3 Unicast Troubleshooting
Multicast Troubleshooting
Virtual Switch System Troubleshooting
Terminology
OIF: Outgoing Interface
OIL: Outgoing Interface List
IGMP: Internet Group Management Protocol
Multicast FIB: Contains the (*,G) and (S,G) entries as well as RPF-VLAN
Adjacency Table: Contains the rewrite information and MET index
LTL: Local Target Logic - forwarding logic for the Catalyst 6500
MET: Multicast Expansion Table - Hardware table that contains the OIFs for the (*,G) and (S,G) entries
Local Target Logic (LTL)
Every valid packet that ingresses the Catalyst 6500 will be sent to a forwarding engine (FE) within the system (DFC or the PFC on the supervisor)
The FE makes the decision about where to forward the packet or to drop the packet
Part of the result of the forwarding decision is a destination LTL index (or destination index)
The destination index is used to select the physical port(s) that will forward the packet
For multicast, another important part of the forwarding decision is the MET index
Multicast Expansion Table (MET)
The MET is the memory where the list of OIFs for the multicast entries is stored
Each replication engine in the chassis has a separate MET
Read using the MET index from the CEF adjacency
MET block contains the list of OIFs and the corresponding destination LTL index
Multicast Expansion Table (MET)
[Figure: MET blocks selected by the indices 0x26, 0x8A and 0x8B taken from the adjacency (ADJ); each block holds rows of OIF VLAN and destination LTL index]
OIF VLAN    LTL
100         0x942
101         0x943
102         0x945
100         0x960
1019        0x961
4030        0x920
4031        0x921
4032        0x933
700         0x919
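As an illustration of how the MET is read, the following Python sketch models the table as a dictionary keyed by the MET index that the adjacency supplies. It is a toy model, not the hardware layout. The indices and OIF VLAN/LTL values come from the slide, but the assignment of three rows to each block is an assumption made purely for the example, and the names MetEntry, met_lookup and replicate are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class MetEntry(NamedTuple):
    oif_vlan: int   # outgoing interface VLAN
    ltl_index: int  # destination LTL index for that OIF


# ASSUMPTION: three rows per block is illustrative only; the slide does not
# make the block boundaries recoverable.
MET: Dict[int, List[MetEntry]] = {
    0x26: [MetEntry(100, 0x942), MetEntry(101, 0x943), MetEntry(102, 0x945)],
    0x8A: [MetEntry(100, 0x960), MetEntry(1019, 0x961), MetEntry(4030, 0x920)],
    0x8B: [MetEntry(4031, 0x921), MetEntry(4032, 0x933), MetEntry(700, 0x919)],
}


def met_lookup(met: Dict[int, List[MetEntry]], met_index: int) -> List[MetEntry]:
    """Return the OIF rows of the MET block selected by the adjacency's MET index."""
    try:
        return met[met_index]
    except KeyError:
        raise LookupError(f"no MET block at index {met_index:#x}") from None


def replicate(met: Dict[int, List[MetEntry]], met_index: int) -> List[Tuple[int, int]]:
    """Produce one (OIF VLAN, destination LTL index) copy per row in the block."""
    return [(entry.oif_vlan, entry.ltl_index) for entry in met_lookup(met, met_index)]


copies = replicate(MET, 0x26)
```

The point of the model is the indirection: the forwarding result carries only an index, and the replication engine expands that index into one copy per OIF.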
Multicast Replication
Replication: Process of creating copies of packets
L2 Replication: Creating copies of a packet within a single VLAN
(e.g., Forwarding a single broadcast packet out all ports within a VLAN)
Does not require a replication engine
L3 Replication: Creating copies of a multicast packet for forwarding out each of the interfaces in an OIL.
Requires a replication engine
For this multicast discussion, the term Replication will mean L3 Replication
Multicast Replication Modes
Replication mode refers to where in the system multicast replication occurs
In a classic system, replication always occurs centrally on the supervisor engine
In a fabric-enabled system, there are two possible replication modes:
Ingress replication mode
Egress replication mode
Ingress Replication Mode
Replication engine on ingress module performs replication for all OIFs
One copy of the original packet is forwarded across the fabric for each of the OIFs
All fabric enabled modules are capable of ingress mode
Only 6516A and 6700 series modules are capable of egress mode
System will default to ingress mode when at least one module not capable of egress mode is present
Packets that ingress on a module without a replication engine (classic module) are replicated by the supervisor’s replication engine
Egress Replication Mode
Each Replication engine performs replication for local OIFs* only
Only a single copy of the original packet is sent across the fabric for all interfaces in the OIL
Requirements:
1. Supervisor Engine 720
2. All cards in chassis must be 6700 series or 6516A
System will default to egress mode whenever possible
System can be forced to either mode with the command mls ip multicast replication-mode [ingress|egress]
* Local OIF: Any OIF local to the replication engine.
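The default mode selection and the difference in fabric load between the two modes can be sketched in a few lines of Python. This is a simplified model of the rules stated on the two slides above, assuming a Supervisor Engine 720 is present. The series labels ("6700", "6516A", "classic") and the function names are hypothetical placeholders, not values read from a switch.

```python
from typing import Iterable

# Per the slides: only 6516A and 6700 series modules are capable of egress mode.
EGRESS_CAPABLE_SERIES = {"6700", "6516A"}


def default_replication_mode(module_series: Iterable[str]) -> str:
    """Egress mode when every module is egress-capable, otherwise ingress mode."""
    modules = list(module_series)
    if modules and all(series in EGRESS_CAPABLE_SERIES for series in modules):
        return "egress"
    return "ingress"


def fabric_copies(mode: str, oif_count: int) -> int:
    """Copies of one multicast packet sent across the fabric for an OIL of oif_count OIFs."""
    if oif_count < 0:
        raise ValueError("oif_count must not be negative")
    if mode == "ingress":
        return oif_count               # one copy per OIF
    if mode == "egress":
        return 1 if oif_count else 0   # a single copy for the whole OIL
    raise ValueError(f"unknown replication mode: {mode}")


mode_all_capable = default_replication_mode(["6700", "6516A", "6700"])
mode_mixed = default_replication_mode(["6700", "classic"])
```

With five OIFs, the model gives five fabric copies in ingress mode and one in egress mode, which is the motivation for preferring egress mode whenever the chassis allows it.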
Local OIF Example
[Figure: a supervisor (PFC with L2 and L3 engines, fabric ASIC and replication engine, port ASIC, SP and RP CPUs) and two CFC line cards (Card 1 and Card 2), each with two replication engine/fabric ASIC pairs and port ASICs, attached to the switch fabric and to the EARL-DBUS/EARL-RBUS. One source (S) and several receivers (R) sit in VLANs B, O, P and G.]
Diagram for troubleshooting example
Group: 225.10.10.10
Source: 172.16.25.1, reached through the L3 network
Gi9/1: VLAN 20, toward the L3 network and the source
Gi9/4: VLAN 20, receiver 20.20.20.100
Gi5/2: VLAN 10, receiver 10.10.10.100
Packet Walk Components
[Figure: supervisor and two line cards (Module 1 and Module 2) attached to the switch fabric and the EARL-DBUS/EARL-RBUS, with the components below labeled]
PFC/Switching Engine: The L2 engine performs L2 lookups. The L3 engine performs L3 FIB and Adj lookups, NetFlow lookups, and RACL, VACL and QoS lookups
Centralized Forwarding Card (CFC): Serves as BUS ASIC. Transmits and receives packets from the EARL-DBUS and EARL-RBUS
Replication Engine/Fabric ASIC: Transmits and receives packets from the switch fabric. Responsible for SPAN and multicast replication. Performs all MET lookups using indices from L3 lookups. Does packet rewrites for packets sent across the fabric
Replication Engine/Fabric ASIC (Supervisor): Transmits and receives packets from the switch fabric. Responsible for SPAN and multicast replication. Performs all MET lookups using indices from L3 lookups. Does packet rewrites for packets sent across the fabric. Also serves as BUS ASIC
Port ASIC: Handles packets to and from physical interfaces. Applies and removes any trunking encap/tags, applies ingress and egress QoS
Ingress Replication Packet Walk
1. Source S sends packets in VLAN O
2. Port ASIC sends to fabric ASIC
3. BUS ASIC sends DBUS packet over EARL-DBUS
4. DBUS ASIC on module 2 receives packet and discards it
5. Supervisor DBUS ASIC receives DBUS packet, accepts it and forwards to PFC
6. L2 engine performs L2 lookup in ingress VLAN and forwards headers to L3 engine
7. L3 engine performs ACL, VACL and QoS lookup in ingress VLAN and performs an RPF check
8. L3 engine returns result to L2 engine. Result contains LTL index for forwarding in the ingress VLAN as well as indices for MET lookup
9. L2 engine sends final result over LC-RBUS to Fabric ASIC
10. Fabric ASIC on Supervisor forwards result onto E-RBUS
11. ASIC originating the DBUS packet accepts the result, all others discard
12. Fabric ASIC on ingress card rewrites the packet according to the result, builds a fabric packet containing the rewritten packet and the result, and forwards it to the fabric
13. Switch Fabric uses the FPOE in the fabric packet and forwards only to channels that have receivers or mrouters in the ingress VLAN (VLAN O)
14. Fabric ASIC on egress module receives packet from the switch fabric and forwards to port ASIC
15. Port ASIC receives the rewritten packet and result and forwards to receiver in the ingress VLAN
16. Replication Engine (RE) performs a lookup in the MET using the MET indices from the result received in step 11 to get all of the OIFs in the OIL
17. RE makes one copy of the original packet for each of the OIFs in the OIL and sends a corresponding DBUS packet for each to the switching engine over the DBUS (only one packet is shown here for brevity)
18. L2 engine does no lookup and forwards appropriate headers to L3 engine
19. L3 engine receives appropriate headers from the L2 engine and performs egress ACL, VACL and QoS lookup
20. L3 engine forwards result to L2 engine and L2 engine forwards result onto LC-RBUS
21. BUS ASIC forwards result onto EARL-RBUS
22. ASIC originating the DBUS packets accepts the results, all others discard
23. Fabric ASIC on ingress card rewrites the packets according to the results, builds a fabric packet for each containing the rewritten packet and the result, and forwards them to the fabric
24. Switch Fabric uses the FPOE in the fabric packet and forwards only to channels that have receivers or mrouters in the egress VLANs
25. Fabric ASICs on egress modules receive packet from the switch fabric and forward to port ASIC
26. Port ASIC receives the rewritten packet and result and forwards to receiver in the egress VLAN
Egress Replication Packet Walk
1. Source sends in VLAN O
2. Port ASIC sends to fabric ASIC
3. Fabric ASIC sends DBUS packet over EARL-DBUS
4. DBUS ASIC on module 2 receives packet and discards it
5. Supervisor DBUS ASIC receives DBUS packet and accepts it
6. DBUS packet forwarded to L2 engine for L2 lookup
7. L3 engine performs lookup using the primary CEF entry. L3 engine also does ACL, VACL and QoS lookup in ingress VLAN and RPF check
8. L3 engine returns result to L2 engine. Result contains LTL index for forwarding in the ingress VLAN as well as indices for MET lookup
9. L2 engine sends final result over LC-RBUS to Fabric ASIC
10. Fabric ASIC on Supervisor forwards result onto E-RBUS
11. ASIC originating the DBUS packet accepts the result, all others discard
12. Fabric ASIC on ingress card rewrites the packet according to the result, builds a fabric packet containing the rewritten packet and the result, and forwards it to the fabric
13. Switch Fabric uses the FPOE in the fabric packet and forwards only to channels that have receivers or mrouters in the ingress VLAN (VLAN O)
14. Fabric ASIC on egress module receives packet from the switch fabric and forwards to port ASIC
15. Port ASIC receives the rewritten packet and result and forwards to receiver in the ingress VLAN
16. RE does a lookup in the MET using the MET3 index from the result received in step 11 to get all of the OIFs in the OIL that are local to this RE. There is one receiver in VLAN G local to the RE
17. RE copies the original packet onto VLAN G and Fabric ASIC sends DBUS packet to the switching engine for egress processing
18. L2 engine performs no lookup and forwards appropriate headers to L3 engine
19. L3 engine receives appropriate headers from the L2 engine and performs ACL, VACL and QoS lookup for VLAN G. Result is forwarded to L2 engine
20. Fabric ASIC receives result and forwards copy of packet and result to port ASIC
21. Port ASIC forwards packet to receiver in VLAN G based on destination index in the result
22. RE does a second lookup in the MET using the MET2 index from the previous result. This yields an egress replication VLAN ID and a destination index
23. RE copies the packet onto the egress replication VLAN and the Fabric ASIC sends a corresponding DBUS packet to the switching engine
24. L2 lookup indicates bridging to all other modules with receivers and flags the packet so that it is replicated by the receiving module
25. No RACL, VACL or QoS lookups are done on this packet. Result is forwarded to L2 engine
26. Result received by fabric ASIC on module 1. All others discard result
27. Fabric ASIC sets FPOE to forward to all cards with local receivers or mrouters in the egress VLANs and sends the packet and result to the fabric in the egress replication VLAN
28. Switch Fabric uses the FPOE in the fabric packet to forward only to channels that have local receivers or mrouters on any OIF
29. Fabric ASICs on egress modules receive the packet on the internal replication VLAN and hand packet over to the RE
30. Fabric ASIC that received the packet on the internal replication VLAN sends packet to the forwarding engine for CEF lookup
31. L2 engine recognizes packet is flagged for egress replication and forwards headers to L3 engine
32. L3 engine performs CEF lookup using secondary entry. Lookup yields MET index for replication to all local OIFs. Result is forwarded to the L2 engine
Note: Steps 30 - 32 are repeated for each of the fabric ASICs that received the packet on the internal replication VLAN. Each needs the result of the CEF lookup (i.e., the index for the MET lookup to get the OIL for all the local receivers and mrouters)
33. Result received by fabric ASIC on module 1. All others discard result
34. RE performs a MET lookup using the MET index from the result and replicates packet onto VLAN B
35. Packet is forwarded over the DBUS to the forwarding engine for an egress lookup
36. L2 engine performs L2 lookup in egress VLAN (VLAN B) and forwards headers to L3 engine
37. L3 engine performs RACL, VACL and QoS lookups for egress VLAN and forwards result to L2 engine
38. Result received by fabric ASIC on module 1. All others discard result
39. Fabric ASIC forwards a copy of packet and result to port ASIC
40. Port ASIC forwards packet to receiver in VLAN B based on destination index in the result
41. RE on module 2 performs a MET lookup using the MET index from the result and replicates the packet onto VLAN P
42. Packet is sent to the forwarding engine over the DBUS for an egress lookup
43. L2 engine performs L2 lookup in egress VLAN (VLAN P) and forwards headers to L3 engine
44. L3 engine performs RACL, VACL and QoS lookups for egress VLAN and forwards result to L2 engine
45. Result received by fabric ASIC on the module 1. All others discard result
46. Fabric ASIC forwards a copy of packet and result to port ASIC
47. Port ASIC forwards packet to receiver in VLAN P based on destination index in the result
48. RE on supervisor performs a MET lookup using the MET index from the result and replicates packet onto VLAN G
49. Packet is forwarded over the LC-DBUS to the forwarding engine for an egress lookup
50. L2 engine performs L2 lookup in egress VLAN (VLAN G) and forwards headers to L3 engine
51. L3 engine performs RACL, VACL and QoS lookups for egress VLAN and forwards result to L2 engine
52. Result received by fabric ASIC on the supervisor. Result is not sent over the EARL-RBUS
53. Fabric ASIC forwards a copy of packet and result to port ASIC
54. Port ASIC forwards packet to receiver in VLAN G based on destination index in the result
55. RE on supervisor performs a MET lookup using the MET index from the result and replicates packet onto VLAN P
56. Packet is forwarded over the DBUS to the forwarding engine for an egress lookup
57. L2 engine performs L2 lookup in egress VLAN (VLAN P) and forwards headers to L3 engine
58. L3 engine performs RACL, VACL and QoS lookups for egress VLAN and forwards result to L2 engine
59. Result received by fabric ASIC on the module 1. All others discard result
60. Fabric ASIC forwards a copy of packet and result to port ASIC
61. Port ASIC forwards packet to receiver in VLAN P based on destination index in the result
Verify L1/L2
Use
show interfaces
show interfaces counters
show interfaces counters errors
and look for any physical layer errors
Follow the L2 troubleshooting steps from the previous section (IP unicast troubleshooting)
Diagram for troubleshooting example
Group: 225.10.10.10
Source: 172.16.25.1, reached through the L3 network
Gi9/1: VLAN 20, toward the L3 network and the source
Gi9/4: VLAN 20, receiver 20.20.20.100
Gi5/2: VLAN 10, receiver 10.10.10.100
Use show ip igmp groups [group] to verify that the receivers’ membership reports are received by the switch
Cat6K#show ip igmp groups 225.10.10.10
IGMP Connected Group Membership
Group Address Interface Uptime Expires Last Reporter
225.10.10.10 Vlan20 00:17:08 00:02:10 20.20.20.100
225.10.10.10 Vlan10 3d04h 00:02:30 10.10.10.100
Receiving IGMP membership reports ?
Shows both receivers in the correct VLANs
Note: The output only shows the last reporter, so a given host may not show up in the output if there are other receivers on the same interface. Make sure that the OIF shows up in the interface column.
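Since the check that matters is whether every expected OIF appears in the Interface column, it can be scripted against saved output. The Python sketch below is illustrative only: it assumes the whitespace-separated column layout shown above, the sample text is copied from the slide, and the helper names igmp_interfaces and missing_oifs are hypothetical.

```python
from typing import Iterable, Set

SAMPLE_OUTPUT = """\
IGMP Connected Group Membership
Group Address    Interface    Uptime    Expires   Last Reporter
225.10.10.10     Vlan20       00:17:08  00:02:10  20.20.20.100
225.10.10.10     Vlan10       3d04h     00:02:30  10.10.10.100
"""


def igmp_interfaces(output: str, group: str) -> Set[str]:
    """Collect the interfaces on which the given group has IGMP members."""
    interfaces = set()
    for line in output.splitlines():
        fields = line.split()
        # Data rows have five columns and start with the group address.
        if len(fields) >= 5 and fields[0] == group:
            interfaces.add(fields[1])
    return interfaces


def missing_oifs(output: str, group: str, expected: Iterable[str]) -> Set[str]:
    """Return the expected OIFs that do not show up in the Interface column."""
    return set(expected) - igmp_interfaces(output, group)


seen = igmp_interfaces(SAMPLE_OUTPUT, "225.10.10.10")
missing = missing_oifs(SAMPLE_OUTPUT, "225.10.10.10", ["Vlan10", "Vlan20"])
```

An empty result from missing_oifs means both receiver VLANs have reported membership. A host missing from the Last Reporter column is not a fault by itself, for the reason given in the note above.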
Use show mac-address-table multicast igmp-snooping to display the IGMP Snooping L2 forwarding table
Cat6K#show mac-address-table multicast igmp-snooping
vlan mac address type learn qos ports
-----+---------------+--------+-----+---+--------------------------
20 0100.5e0a.0a0a static Yes - Gi9/1,Gi9/4,Router
10 0100.5e0a.0a0a static Yes - Gi5/2,Router
20 0100.5e00.0127 static Yes - Gi9/1,Gi9/4,Router
10 0100.5e00.0127 static Yes - Router
10 0100.5e00.0128 static Yes - Router
20 0100.5e00.0128 static Yes - Gi9/1,Gi9/4,Router
Is IGMP Snooping actually snooping ?
Gi9/1 is the incoming interface and also an mrouter port
Gi9/4 contains receiver 20.20.20.100 in VLAN 20
Gi5/2 contains receiver 10.10.10.100 in VLAN 10
Router indicates that the MSFC is a router port
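To know which MAC address to look for in the snooping table, the group address can be mapped to its Ethernet multicast MAC. This sketch uses the standard mapping (the low-order 23 bits of the group address placed behind the 01-00-5e prefix); the function name is hypothetical.

```python
import ipaddress

def group_to_mac(group):
    """Map an IPv4 multicast group to its MAC address in Cisco dotted notation."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF           # only 23 bits of the group survive
    digits = f"{0x01005E000000 | low23:012x}"
    return f"{digits[0:4]}.{digits[4:8]}.{digits[8:12]}"

print(group_to_mac("225.10.10.10"))
print(group_to_mac("224.0.1.39"), group_to_mac("224.0.1.40"))
```

For 225.10.10.10 this yields 0100.5e0a.0a0a, the entry shown in the table. The other two MAC addresses in the table correspond to 224.0.1.39 and 224.0.1.40, the well-known Auto-RP groups. Since only 23 bits are mapped, different groups (for example 224.10.10.10) share the same MAC entry.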
Use show ip route [source] to identify ingress or RPF interface for the multicast traffic
Cat6K#show ip route 172.16.25.1
Routing entry for 172.16.25.0/24
Known via "ospf 100", distance 110, metric 2, type inter area
Last update from 20.20.20.2 on Vlan20, 02:15:43 ago
Routing Descriptor Blocks:
* 20.20.20.2, from 20.20.20.2, 02:15:43 ago, via Vlan20
Route metric is 2, traffic share count is 1
Correct reverse path back to the source?
Use show ip mroute [group] [source] to verify that an mroute entry exists
Cat6K#show ip mroute 225.10.10.10
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel
Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 225.10.10.10), 01:21:15/00:02:55, RP 100.100.100.100, flags: SJC
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Vlan20, Forward/Sparse-Dense, 00:08:28/00:02:55
Vlan10, Forward/Sparse-Dense, 01:21:15/00:02:10
(172.16.25.1, 225.10.10.10), 01:21:15/00:02:50, flags: T
Incoming interface: Vlan20, RPF nbr 20.20.20.2, RPF-MFD
Outgoing interface list:
Vlan10, Forward/Sparse-Dense, 01:21:15/00:02:10, H
Does (S,G) exist? Is it installed in HW?
RPF-MFD - Reverse Path Forwarding-Multicast Fast Drop: when a multicast entry is installed in the hardware, the entry is flagged with the RPF-MFD flag. This flag ensures that multicast traffic that is switched within a VLAN and non-RPF traffic are not bridged to the RP.
H-Flag: multicast entry is installed in hardware
(S,G)
RPF VLAN
OIL
RPF neighbor
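The (S,G) checks above (RPF VLAN, RPF-MFD, H flag per OIF) can be extracted from captured output. A minimal sketch, assuming the (S,G) block is available as text with one entry line per row; parse_sg is a hypothetical helper, not a Cisco tool.

```python
import re

# (S,G) block from the slide, one output line per row.
SAMPLE = """\
(172.16.25.1, 225.10.10.10), 01:21:15/00:02:50, flags: T
  Incoming interface: Vlan20, RPF nbr 20.20.20.2, RPF-MFD
  Outgoing interface list:
    Vlan10, Forward/Sparse-Dense, 01:21:15/00:02:10, H
"""

def parse_sg(output):
    """Collect the RPF interface, RPF-MFD flag and per-OIF H flag."""
    entry = {"oifs": {}}
    for raw in output.splitlines():
        line = raw.strip()
        match = re.match(r"\((\S+), (\S+)\),", line)
        if match:
            entry["source"], entry["group"] = match.groups()
        elif line.startswith("Incoming interface:"):
            parts = [p.strip() for p in line.split(":", 1)[1].split(",")]
            entry["iif"] = parts[0]
            entry["rpf_mfd"] = "RPF-MFD" in parts
        elif "Forward/" in line:
            parts = [p.strip() for p in line.split(",")]
            # A trailing H means the OIF is hardware switched.
            entry["oifs"][parts[0]] = parts[-1] == "H"
    return entry

print(parse_sg(SAMPLE))
```

An OIF mapped to False would be a candidate for software switching and worth a closer look.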
Use show ip mroute [source] [group] count to verify packets are being forwarded for the mroute entry
Cat6K#show ip mroute 225.10.10.10 count
IP Multicast Statistics
5 routes using 3620 bytes of memory
3 groups, 0.66 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)
Group: 225.10.10.10, Source count: 1, Packets forwarded: 350, Packets received: 350
RP-tree: Forwarding: 0/0/0/0, Other: 0/0/0
Source: 172.16.25.1/32, Forwarding: 350/1/975/2, Other: 350/0/0
Are mcast packets being forwarded?
Make sure that forwarding packet counts are incrementing
Make sure that drops are not incrementing
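Whether counters are incrementing can only be judged from two snapshots. The sketch below compares two captures of the Source line; the first values are from the slide, the second snapshot is invented purely for illustration, and the function names are hypothetical.

```python
import re

LINE_RE = re.compile(
    r"Source: (\S+)/32, Forwarding: (\d+)/(\d+)/(\d+)/(\d+), Other: (\d+)/(\d+)/(\d+)"
)

def parse_source_counts(output):
    """Return {source: counters} from 'show ip mroute ... count' text."""
    counts = {}
    for m in LINE_RE.finditer(output):
        fwd, _pps, _size, _kbps, total, rpf_failed, other = map(int, m.groups()[1:])
        counts[m.group(1)] = {
            "forwarded": fwd, "total": total,
            "rpf_failed": rpf_failed, "other_drops": other,
        }
    return counts

def check_progress(before, after):
    """Flag per source whether forwarding and drop counters moved."""
    result = {}
    for src, new in after.items():
        old = before.get(src)
        if old is None:
            continue
        result[src] = {
            "forwarding": new["forwarded"] > old["forwarded"],
            "dropping": new["rpf_failed"] > old["rpf_failed"]
                        or new["other_drops"] > old["other_drops"],
        }
    return result

BEFORE = "Source: 172.16.25.1/32, Forwarding: 350/1/975/2, Other: 350/0/0"
AFTER = "Source: 172.16.25.1/32, Forwarding: 410/1/975/2, Other: 412/2/0"  # hypothetical
print(check_progress(parse_source_counts(BEFORE), parse_source_counts(AFTER)))
```

In the invented second snapshot, forwarding increments but so does the RPF-failed counter, which is the situation the slide warns about.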
Use the show mls ip multicast group [group] command to verify a hardware entry exists for the group and that packets are being forwarded for that entry.
Cat6K#show mls ip multicast group 225.10.10.10
Multicast hardware switched flows:
(172.16.25.1, 225.10.10.10) Incoming interface: Vlan20, Packets switched: 361
Hardware switched outgoing interfaces:
Vlan10
RPF-MFD installed
Are they forwarded in HW?
Verifies RPF interface and shows that packets are being switched and the correct OIF
Which forwarding mode is being used?
Use show mls ip multicast capability to show which forwarding mode is being used.
Cat6K#show mls ip multicast capability
Current mode of replication is Ingress
Configured replication mode is Auto
Slot Multicast replication capability
5 Egress
9 Ingress
Shows that the global mode is Ingress
A single card in the chassis that is only capable of ingress mode causes the global mode to fall back to ingress
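The fallback rule can be written down as a small model. This is a simplified sketch of the behavior described on the slide when the configured mode is Auto, not the switch's actual selection logic; the function name is hypothetical.

```python
def effective_replication_mode(configured, slot_capability):
    """Model the global replication mode for a configured mode of Auto.

    slot_capability maps slot number to "Egress" or "Ingress".
    Assumption: any explicitly configured mode is simply returned as-is.
    """
    if configured.lower() != "auto":
        return configured.capitalize()
    all_egress = all(cap.lower() == "egress" for cap in slot_capability.values())
    return "Egress" if all_egress else "Ingress"

# Slots and capabilities from the slide output.
print(effective_replication_mode("Auto", {5: "Egress", 9: "Ingress"}))
```

With slot 9 limited to ingress, the model returns Ingress, matching the current mode shown in the output.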
Use show mls cef ip multicast group [group] to get adjacency pointer for (S,G) entry in hardware. Also note the rewrite index, met3 index and LTL indices for OIF’s
Cat6K#rem comm sw show mls cef ip multicast group 225.10.10.10 detail
Multicast CEF Entries for VPN#0
(172.16.25.1, 225.10.10.10)
IOSVPN:0 (1) PI:1 (1) CR:0 (1) Recirc:0 (1)
Vlan:20 AdjPtr:30 FibRpfNf:1 FibRpfDf:1 FibAddr:0x30080
rwvlans:20 rwindex:0x9BD adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L3 met2:0x0 met3:0x26
packets:0000000000063 bytes:000000000000007434
Starting Offset: 0x0026
V E C: 10 I:0x009BF
Is there a CEF entry in the HW?
A closer look at the CEF entry
Multicast CEF Entries for VPN#0
(172.16.25.1, 225.10.10.10)
IOSVPN:0 (1) PI:1 (1) CR:0 (1) Recirc:0 (1)
Vlan:20 AdjPtr:30 FibRpfNf:1 FibRpfDf:1 FibAddr:0x30080
rwvlans:20 rwindex:0x9BD adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L3 met2:0x0 met3:0x26
packets:0000000000063 bytes:000000000000007434
Starting Offset: 0x0026
V E C: 10 I:0x009BF
(S,G): (172.16.25.1, 225.10.10.10)
Vlan:20 is the RPF VLAN
AdjPtr:30 is the pointer to the location in the adjacency table where the rewrite, LTL index and MET indices are stored
rwindex:0x9BD is the LTL index used to forward to all receivers and mrouters in the ingress VLAN
met3:0x26 is the MET index used to derive the LTL indices to forward to all receivers and mrouters in all of the OIFs in the OIL
“V E C: 10 I:0x009BF” is the result of the MET lookup using the met3 index. We have only one OIF: VLAN 10 (the egress VLAN). 0x9BF is the LTL index used to forward to all receivers and mrouters in VLAN 10
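The fields called out above can be pulled from captured output with a few regular expressions. A minimal sketch, assuming the entry text is laid out as on this slide; parse_cef_entry is a hypothetical helper.

```python
import re

ENTRY = """\
(172.16.25.1, 225.10.10.10)
IOSVPN:0 (1) PI:1 (1) CR:0 (1) Recirc:0 (1)
Vlan:20 AdjPtr:30 FibRpfNf:1 FibRpfDf:1 FibAddr:0x30080
rwvlans:20 rwindex:0x9BD adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L3 met2:0x0 met3:0x26
packets:0000000000063 bytes:000000000000007434
Starting Offset: 0x0026
V E C: 10 I:0x009BF
"""

def parse_cef_entry(text):
    """Return (key:value fields, list of (egress VLAN, LTL index) MET results)."""
    fields = dict(re.findall(r"(\w+):(\S+)", text))
    met = [(int(vlan), int(ltl, 16))
           for vlan, ltl in re.findall(r"C:\s*(\d+)\s+I:(0x[0-9A-Fa-f]+)", text)]
    return fields, met

fields, met = parse_cef_entry(ENTRY)
print("RPF VLAN:", fields["Vlan"], "AdjPtr:", fields["AdjPtr"])
print("ingress LTL:", hex(int(fields["rwindex"], 16)), "met3:", int(fields["met3"], 16))
print("MET results:", [(vlan, hex(ltl)) for vlan, ltl in met])
```

Converting met3 to decimal here makes the later comparison against the adjacency table easier, since that command prints met3 in decimal.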
Check output of show mls cef adjacency using pointer address from previous command to verify consistency in rewrite index and met3 index
Cat6K#show mls cef adjacency multicast detail | begin 30
Index: 30 smac: 001d.a29a.1f00, dmac: 0000.0000.0000
mtu: 1518, vlan: 20, dindex: 0x9BD, l3rw_vld: 1
format: MULTICAST, flags: 0x2608 met2: 0, met3: 38
packets: 84, bytes: 9912
Check adjacency for consistency
dindex: same as rwindex from the show mls cef ip multicast command
met3: same as met3 index from the show mls cef ip multicast command. Output here is in decimal format
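Because one command prints met3 in hex and the other in decimal, the comparison is easy to get wrong by eye. A small sketch of the consistency check, with a hypothetical function name and the adjacency text taken from the slide:

```python
import re

ADJACENCY = """\
Index: 30 smac: 001d.a29a.1f00, dmac: 0000.0000.0000
mtu: 1518, vlan: 20, dindex: 0x9BD, l3rw_vld: 1
format: MULTICAST, flags: 0x2608 met2: 0, met3: 38
"""

def adjacency_matches(cef_rwindex, cef_met3, adjacency_text):
    """Compare hex rwindex/met3 from the CEF entry with the adjacency output."""
    dindex = int(re.search(r"dindex:\s*(0x[0-9A-Fa-f]+)", adjacency_text).group(1), 16)
    met3 = int(re.search(r"met3:\s*(\d+)", adjacency_text).group(1))  # decimal here
    return dindex == int(cef_rwindex, 16) and met3 == int(cef_met3, 16)

print(adjacency_matches("0x9BD", "0x26", ADJACENCY))
```

0x26 is 38 in decimal, so the two tables agree in this example.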
Use test platform mcast ltl index [index] to verify correct forwarding ports in the ingress VLAN and all OIF’s
Cat6K#rem comm sw test mcast ltl index 9bd
index 0x9BD contain ports 5/T1, 9/1,4,T1,T2
Cat6K#rem comm sw test mcast ltl index 9bf
index 0x9BF contain ports 5/2,T1, 9/T1,T2
Do the indices point to the correct OIF’s?
Gi9/1 is an mrouter port in the ingress VLAN and Gi9/4 is a receiver port in the ingress VLAN
Gi5/2 is a receiver port in the egress VLAN, VLAN 10
The Tn (n=1,2,…) entries refer to the replication engine on the module specified
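The LTL output packs several ports per slot into one line. The following sketch splits it into front-panel ports and replication-engine entries, assuming the line format shown above; parse_ltl is a hypothetical helper.

```python
import re

def parse_ltl(line):
    """Return (index, ports, replication_engines) from a 'test mcast ltl index' line."""
    m = re.match(r"index (0x[0-9A-Fa-f]+) contain ports (.+)", line.strip())
    if not m:
        raise ValueError("unexpected LTL line: " + line)
    ports, engines = [], []
    for group in m.group(2).split(", "):      # one group per slot, e.g. "9/1,4,T1,T2"
        slot, members = group.split("/", 1)
        for member in members.split(","):
            target = engines if member.startswith("T") else ports
            target.append(f"{slot}/{member}")
    return int(m.group(1), 16), ports, engines

print(parse_ltl("index 0x9BD contain ports 5/T1, 9/1,4,T1,T2"))
print(parse_ltl("index 0x9BF contain ports 5/2,T1, 9/T1,T2"))
```

For 0x9BD the ports are 9/1 and 9/4 (ingress VLAN 20); for 0x9BF the only port is 5/2 (VLAN 10).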
Egress mode troubleshooting
In egress mode you will have two CEF entries: a primary entry PI:1 and a non-primary or secondary entry PI:0
Cat6K#rem comm sw show mls cef ip multicast group 225.10.10.10 detail
Multicast CEF Entries for VPN#0
(172.16.25.1, 225.10.10.10)
Primary entry:
IOSVPN:0 (1) PI:1 (1) CR:0 (1) Recirc:0 (1)
Vlan:20 AdjPtr:30 FibRpfNf:1 FibRpfDf:1 FibAddr:0x30080
rwvlans:20 rwindex:0x939 adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L2&L3 met2:0x8A met3:0x8B
packets:0000000000049 bytes:000000000000005782
Starting Offset: 0x008A
V E L0 C:1015 I:0x0080B
Starting Offset: 0x008B
V E C: 10 I:0x0091B
Secondary entry:
IOSVPN:0 (1) PI:0 (1) CR:1 (1) Recirc:0 (1)
Vlan:1015 AdjPtr:65536 FibRpfNf:0 FibRpfDf:1 FibAddr:0x30082
rwvlans:1015 rwindex:0x7FFA adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L3 met2:0x0 met3:0x8B
packets:0000000000000 bytes:000000000000000000
Starting Offset: 0x008B
V E C: 10 I:0x0091B
Egress mode troubleshooting
The primary entry is used by the ingress forwarding engine for:
Forwarding to all receivers & mrouters in the ingress VLAN
Forwarding to all “local” receivers & mrouters on all OIF’s in the OIL
The non-primary entry is used by the egress forwarding engines for:
Forwarding to all “local” receivers & mrouters on all OIF’s in the OIL
A closer look at the primary entry
(172.16.25.1, 225.10.10.10)
IOSVPN:0 (1) PI:1 (1) CR:0 (1) Recirc:0 (1)
Vlan:20 AdjPtr:30 FibRpfNf:1 FibRpfDf:1 FibAddr:0x30080
rwvlans:20 rwindex:0x939 adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L2&L3 met2:0x8A met3:0x8B
packets:0000000000049 bytes:000000000000005782
Starting Offset: 0x008A
V E L0 C:1015 I:0x0080B
Starting Offset: 0x008B
V E C: 10 I:0x0091B
The primary entry
(S,G): (172.16.25.1, 225.10.10.10)
Vlan:20 is the RPF VLAN
rwindex:0x939 is the LTL index used to forward to all receivers and mrouters in the ingress VLAN
met2:0x8A is the MET index used to retrieve the egress replication VLAN and the LTL index used to forward a single copy of the multicast packet across the fabric in the egress replication VLAN
met3:0x8B is the MET index used to retrieve the LTL indices for receivers and mrouters local to the ingress replication engine. One LTL index per OIF in the OIL
met2 block (Starting Offset: 0x008A): Met2 lookup result. Shows egress replication VLAN 1015 and LTL index 0x80B
met3 block (Starting Offset: 0x008B): Met3 lookup result. Shows egress VLAN 10 and LTL index 0x91B
Packet and byte counts should increment with packets forwarded
A closer look at the non-primary entry
IOSVPN:0 (1) PI:0 (1) CR:1 (1) Recirc:0 (1)
Vlan:1015 AdjPtr:65536 FibRpfNf:0 FibRpfDf:1 FibAddr:0x30082
rwvlans:1015 rwindex:0x7FFA adjmac:001d.a29a.1f00 rdt:1 E:0 CAP1:0
fmt:Mcast l3rwvld:1 DM:0 mtu:1518 rwtype:L3 met2:0x0 met3:0x8B
packets:0000000000000 bytes:000000000000000000
Starting Offset: 0x008B
V E C: 10 I:0x0091B
The non-primary entry
Egress replication VLAN 1015
LTL index used to get the packet to the replication engine on the egress module
MET index used by the egress replication engine to retrieve the LTL indices for receivers and mrouters local to the egress replication engine. One LTL index per OIF in the OIL
Met3 lookup result. Shows egress VLAN 10 and LTL index 0x91B
RPF VLAN
Packet and byte counts will always be zero on secondary entry
Egress Mode Troubleshooting
Can check for consistency by reading the MET directly with test mcast rd-met command
Cat6K#rem comm sw test mcast rd-met slot 9 addr 8a end 8b
Met 0x008A V E L0 C: 1015 I: 0x0080B
Met 0x008B V E C: 10 I: 0x0091B
***The slot number will be the slot with the ingress replication engine if looking at the primary entry and the egress slot if looking at the non-primary entry
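The consistency check between the MET read and the CEF entry can be automated. A minimal sketch, assuming the rd-met output has one Met line per address; the expected values are the met2 and met3 blocks of the primary entry, and the names are hypothetical.

```python
import re

MET_RE = re.compile(r"Met (0x[0-9A-Fa-f]+)\s+.*?C:\s*(\d+)\s+I:\s*(0x[0-9A-Fa-f]+)")

def parse_met(output):
    """Return {MET address: (VLAN, LTL index)} from 'test mcast rd-met' text."""
    return {int(addr, 16): (int(vlan), int(ltl, 16))
            for addr, vlan, ltl in MET_RE.findall(output)}

RD_MET = """\
Met 0x008A V E L0 C: 1015 I: 0x0080B
Met 0x008B V E C: 10 I: 0x0091B
"""

# met2/met3 blocks as seen in the primary CEF entry.
CEF_BLOCKS = {0x8A: (1015, 0x80B), 0x8B: (10, 0x91B)}

read_back = parse_met(RD_MET)
mismatches = sorted(addr for addr, want in CEF_BLOCKS.items()
                    if read_back.get(addr) != want)
print("mismatching MET addresses:", [hex(a) for a in mismatches])
```

An empty mismatch list means the MET contents on that slot agree with the CEF entry.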
Egress mode troubleshooting
Use test platform mcast ltl index [index] to verify correct forwarding ports for both entries
Cat6K#rem comm sw test mcast ltl index 80b
index 0x80B contain ports 5/T1
Cat6K#rem comm sw test mcast ltl index 91b
index 0x91B contain ports 5/2
0x80B contains only the replication engine on the egress module
0x91B contains only the port on the egress module where the receiver in VLAN 10 lives
What about the receiver in the ingress VLAN 20 on Gi9/4? Remember, that’s a different LTL index.
Cat6K#rem comm sw test mcast ltl index 939
index 0x939 contain ports 5/T1, 9/1,4,T1,T2
Shows Gi9/4 and Gi9/1 which is an mrouter port in VLAN 20
What if there are DFCs in the system?
In ingress replication mode, all lookups are performed by the DFC on the ingress module
In egress replication mode:
Lookups for the original packet and all those replicated by the ingress replication engine are performed by the DFC on the ingress module
Lookups for all packets replicated by the egress replication engine are performed by the DFC on the egress module
For the supervisor and modules without a DFC, lookups are performed by the PFC on the active supervisor
Troubleshooting is the same as outlined above and show commands are the same as shown, however…
Instead of remote command switch use remote command module [slot#]
Use show platform tech-support ipmulticast [group] [source]
Cat6K#show platform tech-support ipmulticast 225.10.10.10 172.16.25.1
show version
show running-config
show interface Vlan20 counters
show ip igmp group 225.10.10.10
show ip igmp interface Vlan20
show ip mroute 225.10.10.10
show ip mroute 225.10.10.10 count
show mls ip multicast group 225.10.10.10 source 172.16.25.1
show mls ip multicast connected
show mls ip multicast rp-mapping
remote command switch show mac address 0100.5e0a.0a0a vlan 20
remote command switch show mmls igmp process
remote command switch show mls cef ip multicast source 172.16.25.1 group 225.10.10.10 detail
remote command switch show table cbl slot 5 vlan 20
remote command switch show table cbl slot 9 vlan 20
remote command switch show table fpoe slot 5 start 0x938 end 0x939
remote command switch show table fpoe slot 5 start 0x938 end 0x939 sw
remote command switch show table fpoe slot 5 start 0x80B end 0x80B
remote command switch show table fpoe slot 5 start 0x80B end 0x80B sw
remote command switch show table cbl slot 5 vlan 10
remote command switch show table fpoe slot 5 start 2331 end 2331
remote command switch show table fpoe slot 5 start 2331 end 2331 sw
remote command switch test mcast ltl index 938
remote command switch test mcast ltl index 939
remote command switch test mcast ltl index 91B
----- AND MORE -----
Show Platform Tech ipmulticast
Agenda
Sup720 Architecture (A Quick Look)
Layer 2 and Layer 3 Unicast Troubleshooting
Multicast Troubleshooting
Virtual Switch System Troubleshooting
VSS Specific Troubleshooting
VSS test topology network diagram
VSS system control plane debugs
VSS specific L2/L3 packet flow troubleshooting
Which counters and (forwarding) tables to look at
Some useful troubleshooting tools
VSS: What to Check ?
VSS Test Topology Network Diagram
Devices: R1, DUT, R2 (addresses shown: 8.0.1.1, 9.0.1.2)
Port channel labels: Po1, Po2
R1 to DUT links: Gig5/9 - Gig2/6/12, Gig4/16 - Gig1/9/36, Gig2/2 - Gig2/9/15, Gig5/2 - Gig1/6/2, Gig2/4 - Gig1/5/1
DUT to R2 links: Ten1/3/2 - Ten1/1, Ten2/2/7 - Ten1/4
DUT is the Device Under Test we are troubleshooting
DUT is a Virtual Switch System, consisting of 2 6509s with supervisor VS-S720-10G-3C(XL)
R1/R2 are neighboring devices
Connections are respectively a 5 x 1 Gigabit Ethernet port channel and 2 x Ten Gigabit links,
running equal cost multipath routing with respectively 5 (Vlan 701 to 705) and 2 (Ten8/1 and Ten8/3) equal cost paths
VSS System Control Plane Debugs
Virtual Switch Link (VSL) is a special port channel required to bundle 2 physical switches into 1 virtual switch
VSL Protocol (VSLP) runs between active and standby switch over the VSL, and has 2 components:
Link Maintenance Protocol (LMP): runs over each individual link in VSL bundle
Role Resolution Protocol (RRP): runs on each side of the VSL port channel between the 2 physical switches
Enhanced PAgP: PAgP protocol enhanced with extra Type-Length-Value (TLV) fields
VSS Specific Protocols Overview
VSS System Control Plane Debugs
DUT#show switch virtual
Switch mode : Virtual Switch
Virtual switch domain number : 1
Local switch number : 1
Local switch operational role: Virtual Switch Active
Peer switch number : 2
Peer switch operational role : Virtual Switch Standby
DUT#show switch virtual link port-channel
Flags: D - down P - bundled in port-channel
. . .
Group Port-channel Protocol Ports
------+-------------+-----------+-------------------
256 Po256(RU) - Te1/3/3(P) Te1/3/4(P) Te1/3/6(P)
Te1/5/4(P)
255 Po255(RU) - Te2/2/3(P) Te2/2/6(P) Te2/2/8(P)
Te2/5/4(P)
DUT#show switch virtual role
Switch Switch Status Preempt Priority Role Session ID
Number Oper(Conf) Oper(Conf) Local Remote
------------------------------------------------------------------
LOCAL 1 UP TRUE (Y*) 200(200) ACTIVE 0 0
REMOTE 2 UP TRUE (Y*) 100(100) STANDBY 2977 3643
Standby configured preempt timer(switch 2): 5 minutes
Active configured preempt timer(switch 1): 5 minutes
In dual-active recovery mode: No
VSS Quick Configuration Sanity Check
Switch id 1 is active, 2 is standby, both are up
Check status for each link in VSL port channel is P
Interfaces identified by <switchNr>/<modNr>/<portNr>
Unique domain number for each VSS
Switch id 1 side of the VSL
Switch id 2 side of the VSL
Switch is not in dual active recovery mode
VSS System Control Plane Debugs
DUT#show switch virtual link
VSL Status : UP
VSL Uptime : 18 hours, 12 minutes
VSL SCP Ping : Pass
VSL ICC Ping : Pass
VSL Control Link : Te1/5/4
DUT#show switch virtual link port
LMP summary
Link info: Configured: 4 Operational: 4
Peer Peer Peer Peer Timer(s)running
Interface Flag State Flag MAC Switch Interface (Time remaining)
--------------------------------------------------------------------------------
Te1/5/4 vfs operational vfs 0011.bc75.4400 2 Te2/5/4 T4(220ms)
T5(175s)
Te1/3/3 vfs operational vfs 0011.bc75.4400 2 Te2/2/6 T4(220ms)
T5(175s)
Te1/3/4 vfs operational vfs 0011.bc75.4400 2 Te2/2/8 T4(220ms)
T5(175s)
Te1/3/6 vfs operational vfs 0011.bc75.4400 2 Te2/2/3 T4(768ms)
T5(175s)
Flags: v - Valid flag set f - Bi-directional flag set
s - Negotiation flag set
Timers: T4 - Hello Tx Timer T5 - Hello Rx Timer
VSS: Looking at LMP
Carries EOBC and IBC control messages (SCP and ICC/IPC)
How long are we up … did VSL go down ?
Check LMP state and Flags (vf) of the links in the VSL bundle
VSS System Control Plane Debugs
DUT#show switch virtual link port (continued)
. . .
LMP Status
Last operational Current packet Last Diag Time since
Interface Failure state State Result Last Diag
-------------------------------------------------------------------------------
Te1/5/4 No failure Hello bidir Never ran --
Te1/3/3 No failure Hello bidir Never ran --
Te1/3/4 No failure Hello bidir Never ran --
Te1/3/6 No failure Hello bidir Never ran --
LMP hello timer <- LMP timer values
Hello Tx (T4) ms Hello Rx (T5*) ms
Interface State Cfg Cur Rem Cfg Cur Rem
-------------------------------------------------------------------------
Te1/5/4 operational 5000 5000 220 180000 180000 175548
Te1/3/3 operational 5000 5000 220 180000 180000 175548
Te1/3/4 operational 5000 5000 220 180000 180000 175548
Te1/3/6 operational 5000 5000 768 180000 180000 175548
*T5 = min_rx * multiplier
Cfg : Configured Time
Cur : Current Time
Rem : Remaining Time
VSS: Looking at LMP
Any link failures detected by LMP in the past ?
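The LMP hello timer table can be turned into a quick per-link figure. This sketch assumes that the Hello Rx (T5) timer restarts whenever a hello is received, so that Cur minus Rem approximates the time since the last hello, and that the peer sends at the same Tx interval as the local switch; both are assumptions, and the function name is hypothetical.

```python
TIMERS = """\
Te1/5/4 operational 5000 5000 220 180000 180000 175548
Te1/3/3 operational 5000 5000 220 180000 180000 175548
Te1/3/4 operational 5000 5000 220 180000 180000 175548
Te1/3/6 operational 5000 5000 768 180000 180000 175548
"""

def hello_ages(table):
    """Return per-link ms since the last received hello and a 'late' flag."""
    ages = {}
    for line in table.splitlines():
        f = line.split()
        if len(f) == 8 and f[1] == "operational":
            _tx_cfg, tx_cur, _tx_rem, _rx_cfg, rx_cur, rx_rem = map(int, f[2:])
            since_rx = rx_cur - rx_rem
            # 'late' means more than one Tx interval has passed without a hello.
            ages[f[0]] = {"since_last_rx_ms": since_rx, "late": since_rx > tx_cur}
    return ages

for link, info in hello_ages(TIMERS).items():
    print(link, info)
```

In the slide output every link last heard a hello about 4.5 seconds ago, inside the 5 second Tx interval.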
VSS System Control Plane Debugs
DUT#show vslp lmp counters
Instance #1:
LMP counters
Tx Rx
Interface OK Fail Bidir Uni Fail Bad
--------------------------------------------------------------------
Te1/5/4 12649 0 12675 1 0 0
Te1/3/3 12000 0 12024 0 0 0
Te1/3/4 11999 0 12024 0 0 0
Te1/3/6 12001 0 12025 0 0 0
Rx error details
Interface My info My info Bad MAC Bad switch Domain id Peer info
mismatch absent Address id mismatch mismatch
-------------------------------------------------------------------------------
Te1/5/4 0 1 0 0 0 0
Te1/3/3 0 0 0 0 0 0
Te1/3/4 0 0 0 0 0 0
Te1/3/6 0 0 0 0 0 0
DUT#clear vslp lmp counters ?
interface Interface
<cr>
VSS: Looking at LMP
LMP packets tx’ed to the VSL peer
Problem sending LMP packet to the VSL peer ?
Packets received from VSL peer with our info, proving the link is bidirectional
Packets received from VSL peer without our info, proving the link is unidirectional at that moment; when the first link comes up, the first packet will always be a unidir packet
Problem receiving LMP packet from the VSL peer ?
Receiving incorrect LMP packets ?
Why receiving incorrect LMP packets ……configuration error ?
OK, 1 unidirectional packet when first link in VSL came up
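The questions above map directly onto the counter columns. A minimal sketch that flags suspicious links, assuming the six-column LMP counters table is passed in as text; the threshold of one tolerated unidirectional packet follows the slide's remark, and the second sample row is invented for illustration.

```python
COUNTERS = """\
Te1/5/4 12649 0 12675 1 0 0
Te1/3/3 12000 0 12024 0 0 0
Te1/3/4 11999 0 12024 0 0 0
Te1/3/6 12001 0 12025 0 0 0
"""

def lmp_counter_problems(table, allowed_unidir=1):
    """Return {interface: [issues]} for rows of Tx OK/Fail, Rx Bidir/Uni/Fail/Bad."""
    problems = {}
    for line in table.splitlines():
        f = line.split()
        if len(f) == 7 and all(x.isdigit() for x in f[1:]):
            _tx_ok, tx_fail, _rx_bidir, rx_uni, rx_fail, rx_bad = map(int, f[1:])
            issues = []
            if tx_fail:
                issues.append("tx fail")
            if rx_fail or rx_bad:
                issues.append("rx fail/bad")
            if rx_uni > allowed_unidir:
                issues.append("unidirectional")
            if issues:
                problems[f[0]] = issues
    return problems

print(lmp_counter_problems(COUNTERS))
print(lmp_counter_problems("Te1/3/3 12000 3 12024 5 0 0"))  # hypothetical bad link
```

The slide's counters produce no findings; the single unidirectional packet on Te1/5/4 is expected when the first link comes up.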
VSS System Control Plane Debugs
DUT#show switch virtual role detail
Switch Switch Status Preempt Priority Role Session ID
Number Oper(Conf) Oper(Conf) Local Remote
------------------------------------------------------------------
LOCAL 1 UP TRUE (Y*) 200(200) ACTIVE 0 0
REMOTE 2 UP TRUE (Y*) 100(100) STANDBY 2977 3643
Standby configured preempt timer(switch 2): 5 minutes
Active configured preempt timer(switch 1): 5 minutes
RRP Counters:
--------------------------------------------------------------------
Inst. Peer Direction Req Acc Est Rsugg Racc
----------------------------------------------------------------------
1 1 Tx 0 2 0 2 6
1 1 Rx 2 0 2 0 6
RRP FSM info
----------------------------------------------------------------------
sm(vslp_rrp RRP SM information for Instance 1, Peer 1), running yes, state role_res
Last transition recorded: (req)-> hold (srt_exp)-> hold (est)-> role_neg (srt_exp)-> role_neg (est)-> role_neg (racc)-> role_res (srt
_. . .
In dual-active recovery mode: No
DUT# show vslp rrp ?
counters Counters
detail detail information
fsm Finit State Machine (FSM) information
summary Summary information
VSS: Looking at RRP
Same information as in “show switch virtual role detail”
State machine info on RRP protocol; current state is “role resolved”
Check 1 is active, 1 is standby
Switch is not in dual active recovery mode
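The two role checks (one active, one standby, not in recovery mode) can be scripted against captured show switch virtual role output. A minimal sketch with a hypothetical function name; the column positions are assumed from the output shown on these slides.

```python
NORMAL = """\
LOCAL 1 UP TRUE (Y*) 200(200) ACTIVE 0 0
REMOTE 2 UP TRUE (Y*) 100(100) STANDBY 2977 3643
In dual-active recovery mode: No
"""

def vss_role_health(output):
    """Summarize roles, status and recovery mode from the role output."""
    roles, all_up, recovery = [], True, None
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("LOCAL", "REMOTE"):
            all_up = all_up and fields[2] == "UP"
            roles.append(fields[6])            # Role column
        elif line.strip().startswith("In dual-active recovery mode:"):
            recovery = line.split(":", 1)[1].strip()
    healthy = sorted(roles) == ["ACTIVE", "STANDBY"] and all_up and recovery == "No"
    return {"roles": roles, "all_up": all_up, "recovery": recovery, "healthy": healthy}

print(vss_role_health(NORMAL))
```

Output with only a LOCAL row, or with recovery mode reported as Yes, is reported as not healthy.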
VSS System Control Plane Debugs
What if VSL fails ?
2 separate physical switches with identical configuration on same network = trouble
Solution: dual active detection mechanisms will make sure only one “active” switch has interfaces up; the other active switch will be in recovery mode (all of its interfaces down)
Enhanced PAgP based:
New “Dual Active” Type-Length-Value field in PAgP is used to insert info on which physical switch is active in the VSS in PAgP packets to/from remote switch attached to VSS
Requires:
Multi-chassis EtherChannel (MEC) with at least one interface member from both switches in the VS; it must be running the PAgP protocol, with at least one side’s mode configured as desirable.
PAgP dual-active detection mechanism must be enabled (“dual-active detection pagp” command)
The specific port channel must be “trusted” to be used for dual-active detection (“dual-active detection pagp trust channel-group” command)
The MEC neighbor switch must be running an image capable of supporting the enhanced PAgP dual-active TLVs.
VSS: Enhanced PAgP Based Dual Active Detection
VSS System Control Plane Debugs
BFD based: BFD dual-active detection requires that the pair of interfaces being used for this method be directly connected via a cable.
Requires:
An IP address must be configured on each interface.
The two interfaces must be on different subnets.
BFD interval parameters must be configured on the interfaces.
The BFD dual-active detection mechanism must be enabled; this is configured using the “dual-active detection bfd” command.
The pair of interfaces to be used in the detection mechanism must be specified using the “dual-active pair interface” command.
The BFD neighbors are not created until the VSL fails; BFD neighbor establishment is the trigger of the dual-active detection !!
VSS: BFD Based Dual Active Detection
VSS system control plane debugs
VSS: dual active detection setup
Devices: R1, DUT, R2 (addresses shown: 8.0.1.1, 9.0.1.2)
Port channel labels: Po1, Po2 (trusted: Port-channel2)
R1 to DUT links: Gig5/9 - Gig2/6/12, Gig4/16 - Gig1/9/36, Gig2/2 - Gig2/9/15, Gig5/2 - Gig1/6/2, Gig2/4 - Gig1/5/1
DUT to R2 links: Ten1/3/2 - Ten1/1, Ten2/2/7 - Ten1/4
Port channel 2 is the only port channel trusted for dual active detection
BFD direct connection is between Gig1/6/1 and Gig2/9/1
Both mechanisms can be on simultaneously
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
DUT#sh startup-config | b switch virtual
switch virtual domain 1
switch mode virtual
...
dual-active detection pagp trust channel-group 2
dual-active pair interface GigabitEthernet1/6/1 interface GigabitEthernet2/9/1 bfd
dual-active exclude interface GigabitEthernet1/5/3
dual-active exclude interface GigabitEthernet2/5/3
!
interface GigabitEthernet1/6/1
no switchport
ip address 100.10.10.9 255.255.255.252
bfd interval 50 min_rx 50 multiplier 3
end
interface GigabitEthernet2/9/1
no switchport
ip address 100.10.10.13 255.255.255.252
bfd interval 50 min_rx 50 multiplier 3
...
ip route 100.10.10.8 255.255.255.252 GigabitEthernet2/9/1
ip route 100.10.10.12 255.255.255.252 GigabitEthernet1/6/1
Truncated display
Port channel 2 is trusted for Enhanced PAgP dual active detection
Interface pair for BFD dual active detection
Interfaces excluded from recovery mode; they will not go down in case the switch ends up in recovery mode
BFD configuration on interface pair for BFD dual active detection; notice they are in different subnets !! Exclude these from redistribution in routing protocols.
Automatically added with “dual-active pair” command; required to be present as the directly connected interfaces are in different subnets !! Don’t redistribute these static routes.
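The subnet requirement and the resulting timing can be checked offline from the configuration values. This sketch uses the addresses and BFD parameters from the configuration above; the detection time shown is the usual interval times multiplier estimate, assuming both sides use the same parameters, and the function name is hypothetical.

```python
import ipaddress

def bfd_pair_check(if1, if2, interval_ms, multiplier):
    """Check that the two BFD interfaces sit in different subnets."""
    net1 = ipaddress.ip_interface(if1).network
    net2 = ipaddress.ip_interface(if2).network
    return {
        "networks": (str(net1), str(net2)),
        "different_subnets": net1 != net2,
        # Rough failure detection time once a BFD session is established.
        "detect_ms": interval_ms * multiplier,
    }

print(bfd_pair_check("100.10.10.9/255.255.255.252",
                     "100.10.10.13/255.255.255.252",
                     interval_ms=50, multiplier=3))
```

The two networks come out as 100.10.10.8/30 and 100.10.10.12/30, which are the prefixes of the two static routes in the configuration, each pointing out the opposite interface of the pair.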
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
DUT#show switch virtual dual-active summary
Pagp dual-active detection enabled: Yes
Bfd dual-active detection enabled: Yes
Interfaces excluded from shutdown in recovery mode:
Gi1/5/3
Gi2/5/3
In dual-active recovery mode: No
DUT#show switch virtual dual-active pagp
PAgP dual-active detection enabled: Yes
PAgP dual-active version: 1.1
Channel group 2 dual-active detect capability w/nbrs
Dual-Active trusted group: Yes
Dual-Active Partner Partner Partner
Port Detect Capable Name Port Version
Gi1/5/1 Yes R1 Gi2/4 1.1
Gi1/6/2 Yes R1 Gi5/2 1.1
Gi1/9/36 Yes R1 Gi4/16 1.1
Gi2/6/12 Yes R1 Gi5/9 1.1
Gi2/9/15 Yes R1 Gi2/2 1.1
Channel group 3 dual-active detect capability w/nbrs
Dual-Active trusted group: No
. . .
Port channel 2 is trusted for Enhanced PAgP dual active detection … at least 1 trusted port channel needed !!
Check that the neighbor runs a SW version that supports Enhanced PAgP … if not, no dual active detection !!
At least 1 port channel member on each switch id !!
Port channel 3 is not trusted for Enhanced PAgP dual active detection
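The per-port checks for a trusted channel group can be sketched as follows, assuming the member rows of show switch virtual dual-active pagp are passed in as text; the function name is hypothetical and the switch id is taken from the first number of the interface name.

```python
import re

ROWS = """\
Gi1/5/1 Yes R1 Gi2/4 1.1
Gi1/6/2 Yes R1 Gi5/2 1.1
Gi1/9/36 Yes R1 Gi4/16 1.1
Gi2/6/12 Yes R1 Gi5/9 1.1
Gi2/9/15 Yes R1 Gi2/2 1.1
"""

def pagp_dual_active_ready(rows):
    """Check members exist on both switch ids and all partners are capable."""
    switches, not_capable = set(), []
    for line in rows.splitlines():
        f = line.split()
        m = re.match(r"[A-Za-z]+(\d+)/\d+/\d+$", f[0]) if f else None
        if m and len(f) == 5:
            switches.add(int(m.group(1)))
            if f[1] != "Yes":
                not_capable.append(f[0])
    ready = switches == {1, 2} and not not_capable
    return {"switches": sorted(switches), "not_capable": not_capable, "ready": ready}

print(pagp_dual_active_ready(ROWS))
```

The trust setting itself is not covered by this sketch; it still has to be read from the "Dual-Active trusted group" line.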
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
DUT#show switch virt dual-active bfd
Bfd dual-active detection enabled: Yes
Bfd dual-active interface pairs configured:
interface-1 Gi1/6/1 interface-2 Gi2/9/1
DUT#
Triggering dual active situation by shutdown of VSL (switch id 2 side)
DUT#show int po 255 | i Member
Members in this channel: Te2/2/3 Te2/2/6 Te2/2/8 Te2/5/4
DUT#conf t
Enter configuration commands, one per line. End with CNTL/Z.
DUT(config)#int range ten 2/2/3 , Te2/2/6 , Te2/2/8 , Te2/5/4
DUT(config-if-range)#shutdown
*Apr 1 12:40:22.885 CET: %PAGP_DUAL_ACTIVE-SW1_SP-1-RECOVERY: PAgP running on Gi1/5/1 triggered dual-active recovery: active id 0011.bc75.4400 received, expected 0011.5d54.6800
*Apr 1 12:40:22.945 CET: %DUAL_ACTIVE-SW1_SP-1-DETECTION: Dual-active condition detected: all non-VSL and non-excluded interfaces have been shut down
Truncated display
Enhanced PAgP detected both switch id’s were active at the same time
Originally active switch id (switch 1 in example) goes into recovery mode
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
DUT#show switch virtual role
Switch Switch Status Preempt Priority Role Session ID
Number Oper(Conf) Oper(Conf) Local Remote
------------------------------------------------------------------
LOCAL 1 UP TRUE (Y*) 200(200) ACTIVE 0 0
Active configured preempt timer(switch 1): 5 minutes
In dual-active recovery mode: Yes
Triggered by: PAgP detection
Triggered on interface: Gi1/5/1
Received id: 0011.bc75.4400
Expected id: 0011.5d54.6800
DUT#
Alternative: “show switch virtual dual-active summary”
On switch in recovery mode, all interfaces except for the ones excluded from recovery mode should be down: quick check via “show ip interface brief | i up” that only the ones allowed are up
On switch id 1: originally active, now in recovery mode
Doesn’t see switch id 2 (as VSL is still down)
Mechanism that detected dual active was Enhanced PAgP, via link 1/5/1
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
Trying to bring the system back up via “no shutdown” of VSL port channel: you need to do this on both sides (active switch in recovery mode as well as the real active switch id 2 at this point in time)
DUT#conf t
Enter configuration commands, one per line. End with CNTL/Z.
DUT(config)#int range ten 2/2/3 , Te2/2/6 , Te2/2/8 , Te2/5/4
DUT(config-if-range)#no sh
DUT(config-if-range)#
DUT#
*Apr 1 12:49:29.513 CET: %DUAL_ACTIVE-1-VSL_RECOVERED: VSL has recovered during dual-active situation: Reloading switch 1
*Apr 1 12:49:29.513 CET: %VS_GENERIC-5-VS_CONFIG_DIRTY: Configuration has changed. Ignored reload request until configuration is saved
Switch in recovery mode should reload and come back up as standby !!
The configuration has been modified (sh/no sh), so it needs to be saved before the switch will recover. If it is not saved and the configurations are possibly out of sync (any “conf t” was issued without saving while the VSS was still up), standby mode will be RPR+ until we manually save/sync the configuration and reset the standby.
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
For reference: console logs on switch id 2 (standby -> active upon VSL failure)
*Apr 1 12:40:08.032 CET: %VSLP-SW2_SPSTBY-3-VSLP_LMP_FAIL_REASON: Te2/2/6: Link down
*Apr 1 12:40:12.825 CET: %VSLP-SW2_SPSTBY-3-VSLP_LMP_FAIL_REASON: Te2/2/8: Link down
*Apr 1 12:40:20.096 CET: %VSLP-SW2_SPSTBY-2-VSL_DOWN: Last VSL interface Te2/5/4 went down
*Apr 1 12:40:20.096 CET: %VSLP-SW2_SPSTBY-2-VSL_DOWN: All VSL links went down while switch is in Standby role
*Apr 1 12:40:20.096 CET: %DUAL_ACTIVE-SW2_SPSTBY-1-VSL_DOWN: VSL is down switchover, or possible dual-active situation has occurred
*Apr 1 12:40:20.100 CET: %PFREDUN-SW2_SPSTBY-6-ACTIVE: Initializing as Virtual Switch ACTIVE processor
DUT#show switch virtual role
Switch Switch Status Preempt Priority Role Session ID
Number Oper(Conf) Oper(Conf) Local Remote
------------------------------------------------------------------
LOCAL 2 UP TRUE (Y*) 100(100) ACTIVE 0 0
Active configured preempt timer(switch 2): 5 minutes
In dual-active recovery mode: No
Switch id 2 is now the “real” active switch, and doesn’t see switch id 1 as long as the VSL is down !!
VSS System Control Plane Debugs
VSS: Dual Active Detection Troubleshooting
For reference: console logs on switch id 2 (standby -> active upon VSL failure)
DUT#conf t
Enter configuration commands, one per line. End with CNTL/Z.
DUT(config)#int range ten 2/2/3 , Te2/2/6 , Te2/2/8 , Te2/5/4
DUT(config-if-range)#no sh
*Apr 1 12:49:32.781 CET: %LINK-SW2_SP-3-UPDOWN: Interface TenGigabitEthernet2/5/4 changed state to up
*Apr 1 12:49:49.128 CET: %VSLP-SW2_SP-5-VSL_UP: Ready for Role Resolution with Switch=1, MAC=0011.5d54.6800 over Te2/2/6
*Apr 1 12:49:50.320 CET: Initializing as Virtual Switch ACTIVE processor
*Apr 1 12:49:52.140 CET: %VSLP-SW2_SP-5-RRP_MSG: Peer Switch with unsaved configurations needs to be reloaded.
Please save relevant configurations on the peer switch and reload it.
BFD based mechanism: similar
If no dual-active detection method was enabled and the VSL recovers, RRP determines which switch stays active and which reloads to become standby, based on the switch number and priority configurations. In case of a “dirty configuration” (any “conf t” command has been issued), it puts the “to-become-standby” switch into recovery and waits for a manual reload command.
“Unshutting” the VSL links on both switch id’s
VSS System Control Plane Debugs
VSS: Dual Active Debug Commands
For reference, and use with caution:
debug switch virtual dual-active detect bfd events (on RP ONLY)
debug switch virtual dual-active detect summary
  general dual-active debugging when going into recovery mode
debug pagp dual-active (SP ONLY)
  enable PAgP dual-active specific debugging
VSS Specific Troubleshooting
VSS test topology network diagram
VSS system control plane debugs
VSS specific L2/L3 packet flow troubleshooting
Which counters and (forwarding) tables to look at
Some useful troubleshooting tools
VSS: What to Check ?
VSS L2/L3 Forwarding (Data Plane)
Multi-chassis EtherChannel (MEC): the hash is modified so that links on the local physical switch are preferred for transmitting a packet, instead of links on the remote switch
Equal Cost Multi Path (ECMP): the adjacency table is modified to prefer next hops attached to the local switch; paths on the remote switch are selected only if no local paths are available
Knowing this, the same commands and steps as in standalone mode can be used, enhanced with an option to give the switch id as input
VSS Data Plane Design: Minimal Load on the VSL
VSS L2/L3 Forwarding Setup
VSS: Data Path Test Setup
[Figure: test topology R1 - DUT - R2, with addresses 8.0.1.1 (R1 side) and 9.0.1.2 (R2 side). Port-channel labels: Po1, Po2. Port labels: Gig5/9, Gig2/6/12, Gig1/9/36, Gig4/16, Gig2/9/15, Gig2/2, Gig1/6/2, Gig5/2, Gig1/5/1, Gig2/4, Ten1/3/2, Ten1/1, Ten2/2/7, Ten1/4.]
DUT learns 8.0.1.0/24 via ECMP on VLANs 701 to 705 over port channel 2
DUT learns 9.0.1.0/24 via ECMP on L3 interfaces Ten1/3/2 and Ten2/2/7
Launching ping from 8.0.1.1 to 9.0.1.2
Using similar commands/steps as in Unicast L3 troubleshooting to find out the path; only VSS specifics are highlighted in the next slides
VSS L2/L3 Forwarding (Data Plane)
Verify the load-balance algorithm used
DUT#show etherchannel load-balance switch 2 mod 2
EtherChannel Load-Balancing Configuration:
src-dst-ip enhanced
mpls label-ip
EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address
MPLS: Label or IP
Identify the physical interface that the flow to host 1 (out of Port-channel 2) will use
DUT#show etherchannel load-balance hash-result interface Port-channel 2 switch 2 ip 9.0.1.2 8.0.1.1
Computed RBH: 0x3
Would select Gi2/9/15 of Po2
DUT#show etherchannel load-balance hash-result interface Port-channel 2 switch 1 ip 9.0.1.2 8.0.1.1
Computed RBH: 0x3
Would select Gi1/6/2 of Po2
For MEC, load-balance should prefer physical interfaces local to the switch the packet was received on
VSS Data Plane Troubleshooting L2 MEC
Important: depending on the type of load balancing used, use different arguments, e.g. in case of dst-ip, only give the destination ip as argument … otherwise command doesn’t work correctly
Packet coming in on switch id 2, needing to go out on Po2 will select Gi2/9/15
Packet coming in on switch id 1, needing to go out on Po2 will select Gi1/6/2
What type of etherchannel load balancing is being used on this module ?
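The MEC check above boils down to: the member link reported by “hash-result ... switch N” should itself live on switch N. A small Python sketch of that comparison, assuming the command output has been captured as text; the helper names are hypothetical and the interface pattern assumes the VSS switch/slot/port naming seen in the slides:

```python
import re

# Matches the result line printed by the hash-result command on the slide.
HASH_RESULT = re.compile(r"Would select (\S+) of (\S+)")


def chassis_of(interface: str) -> int:
    """Return the switch id from a VSS interface name such as Gi2/9/15."""
    match = re.match(r"[A-Za-z]+(\d+)/\d+/\d+$", interface)
    if match is None:
        raise ValueError(f"not a switch/slot/port name: {interface}")
    return int(match.group(1))


def selected_link_is_local(output: str, ingress_switch: int) -> bool:
    """True when the selected member link sits on the ingress switch."""
    match = HASH_RESULT.search(output)
    if match is None:
        raise ValueError("no 'Would select' line in output")
    return chassis_of(match.group(1)) == ingress_switch


# Outputs copied from the slide.
sample_sw2 = "Computed RBH: 0x3\nWould select Gi2/9/15 of Po2"
sample_sw1 = "Computed RBH: 0x3\nWould select Gi1/6/2 of Po2"

print(selected_link_is_local(sample_sw2, 2))
print(selected_link_is_local(sample_sw1, 1))
```

A False result would mean the flow crosses the VSL to leave through the peer chassis, which is the case the MEC hash modification is meant to avoid while local links are available.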
VSS L2/L3 Forwarding (Data Plane)
Routing table shows 2 Equal Cost Paths to 9.0.1.0/24
DUT#show ip route 9.0.0.0 | i via
Known via "eigrp 101", distance 90, metric 3072, type internal
Redistributing via eigrp 101
7.7.1.2, from 7.7.1.2, 1d00h ago, via TenGigabitEthernet2/2/7
* 7.6.1.2, from 7.6.1.2, 1d00h ago, via TenGigabitEthernet1/3/2
Looking at the HW table shows next hop directly attached to local switch is preferred
DUT#show mls cef lookup 9.0.1.0 switch 1 mod 3
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
108775 9.0.0.0/8 Te1/3/2 , 000f.35ed.7c00
DUT#show mls cef lookup 9.0.1.0 switch 2 mod 2
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
108775 9.0.0.0/8 Te2/2/7 , 000f.35ed.7c00
DUT#show mls cef exact-route 8.0.1.1 0 9.0.1.2 0 switch 1 mod 3
Interface: Te1/3/2, Next Hop: 7.6.1.2, Vlan: 4064, Destination Mac: 000f.35ed.7c00
DUT#show mls cef exact-route 8.0.1.1 0 9.0.1.2 0 switch 2 mod 2
Interface: Te2/2/7, Next Hop: 7.7.1.2, Vlan: 4056, Destination Mac: 000f.35ed.7c00
Further, use similar commands (enhanced with extra argument of switch id) as in standalone switch
VSS Data Plane Troubleshooting ECMP
Packet coming in on switch 1 module 3, for 9.0.0.0/8 prefers next hop attached to local switch id 1
Packet coming in on switch 2 module 2, for 9.0.0.0/8 prefers next hop attached to local switch id 2
? ... show vlan internal usage | i 4064
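The same local-preference check can be applied to the ECMP case by parsing the one-line “show mls cef exact-route” result. A hedged Python sketch, assuming the line format shown on the slide; parse_exact_route and egress_switch are illustrative helpers, not Cisco tooling:

```python
import re

# Field layout taken from the exact-route output line on the slide.
EXACT_ROUTE = re.compile(
    r"Interface:\s*(?P<intf>[^,]+),\s*Next Hop:\s*(?P<nh>[^,]+),\s*"
    r"Vlan:\s*(?P<vlan>\d+),\s*Destination Mac:\s*(?P<mac>\S+)"
)


def parse_exact_route(line: str) -> dict:
    """Split an exact-route result line into its four fields."""
    match = EXACT_ROUTE.search(line)
    if match is None:
        raise ValueError("unrecognised exact-route line")
    return {
        "interface": match.group("intf").strip(),
        "next_hop": match.group("nh").strip(),
        "vlan": int(match.group("vlan")),
        "mac": match.group("mac"),
    }


def egress_switch(interface: str) -> int:
    """First number of a switch/slot/port name, e.g. 1 for Te1/3/2."""
    match = re.match(r"[A-Za-z]+(\d+)/", interface)
    if match is None:
        raise ValueError(f"unexpected interface name: {interface}")
    return int(match.group(1))


line_sw1 = ("Interface: Te1/3/2, Next Hop: 7.6.1.2, Vlan: 4064, "
            "Destination Mac: 000f.35ed.7c00")
line_sw2 = ("Interface: Te2/2/7, Next Hop: 7.7.1.2, Vlan: 4056, "
            "Destination Mac: 000f.35ed.7c00")

for ingress, line in ((1, line_sw1), (2, line_sw2)):
    route = parse_exact_route(line)
    print(ingress, route["interface"], egress_switch(route["interface"]) == ingress)
```

The parsed Vlan value is the internal VLAN that the “show vlan internal usage” hint refers to.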
VSS Troubleshooting Tools
Virtual Slot Numbers: some log messages can display virtual slot numbers, to identify matching switch id/module number:
DUT#show switch virtual slot-map
Virtual Slot to Remote Switch/Physical Slot Mapping Table:
Virtual Remote Physical Module
Slot No Switch No Slot No Uptime
---------+-----------+----------+----------
17 1 0 -
18 1 0 -
19 1 3 1d01h
20 1 0 -
Capture all info specific to VSS:
show switch virtual troubleshooting all
VSS Additional Useful Commands
No module present in switch id 1, slot 1
Module present in virtual slot id 19, maps to switch id 1, slot 3,
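When a log message only gives a virtual slot number, the slot-map table can be turned into a lookup. A minimal Python sketch, assuming the table layout shown above and treating an uptime of “-” as “no module present” (that reading follows the slide annotations); parse_slot_map is a hypothetical helper:

```python
def parse_slot_map(output: str) -> dict:
    """Map virtual slot -> (switch id, physical slot or None when empty)."""
    mapping = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly four columns, the first three numeric.
        if len(fields) == 4 and all(f.isdigit() for f in fields[:3]):
            virtual, switch, slot = (int(f) for f in fields[:3])
            mapping[virtual] = (switch, slot if fields[3] != "-" else None)
    return mapping


# Rows copied from the slide output.
SAMPLE = """\
Virtual Remote Physical Module
Slot No Switch No Slot No Uptime
---------+-----------+----------+----------
17 1 0 -
18 1 0 -
19 1 3 1d01h
20 1 0 -
"""

slot_map = parse_slot_map(SAMPLE)
print(slot_map[19])
print(slot_map[17])
```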
Troubleshooting VSS
VSS control plane issues: VSS doesn’t form, dual active, dirty configuration …
Checked using VSS specific commands
VSS data plane forwarding (L2/L3)
Checked what is different in VSS …
Troubleshooting the VSS data plane is pretty much the same as standalone: step by step, no skipping steps !!
Problems We’ve Looked at
Now What … ?
What Did We Just Talk About?
Path verification. “Get oriented”
Great time to have good diagrams.
Looking at counters and HW forwarding tables for IP unicast, multicast, VSS
Check HW/SW consistency …
There is more reference material in the appendices on: QoS, WS-Sup32P, Modular IOS, HW health monitoring
Still … I need TAC assistance … how about a cheat sheet?
Catalyst 6500 Sup720 Native IOS Troubleshooting Procedure
Collect any syslogs or tacacs logs
telnet to switch
Log your session to your Desktop!!
On RP
  terminal length 0
  show log
  show clock
  show tech
  show tech platform
Determine Problem Type
Supervisor Failover
  On Route Processor (RP)
    show scp accounting
    show scp counters
    show eobc
    show ibc
    show ipc status
    show ipc ports
    show heartbeat
    show fabric errors
    show fabric utilization
    show fabric channel
  On Switch Processor (SP)
    show scp accounting
    show scp counters
    show eobc
    show ibc
    show earl status
    show earl statistics
    show fabric errors
    show fabric timeout
    show ipc status
    show ipc ports
    show heartbeat
    sh platform hard superman config
    show platform hard tycho interrupt
Routing
  On RP
    show platform tech unicast <..>
    show ip arp
    show ip cef
    show adjacency detail
    show ip route
    show ip ospf statistics
    show ip ospf data data
    show ip ospf neigh
    show ip bgp neighbor
    show ip bgp summary
    show ip eigrp neighbor
    show ip eigrp topology
    traceroute <w.x.y.z>
    show mls cef summary
    show mls cef
    show mls cef adjacency
  On SP
    show mls cef ip detail
    show mls cef inconsistency
    show mls cef summary
Module
  On RP
    show module <mod>
    show idprom all detail
    show power
    show diagnostic result <mod>
Multicast
  On RP
    show platform tech-support ipmulticast <..>
    show tech ipmulticast
    show ip mroute
    show mls ip multi connected
    show mls ip multi statistics
    show mls ip multi sum
    show mls ip multi group <ip> source <ip>
    show mls rp ip
  On SP
    show mmls v g g
    show mls cef ip multicast detail
Send data to Cisco TAC and attach to case
Open Case with Cisco TAC
  P1/P2 Phone only!
  P3/P4 – email
  Include:
    1. Brief Description
    2. Bridge number
    3. Hostname and IP
Q and A
Recommended Reading
Continue your Cisco Live learning experience with further reading from Cisco Press
Check the Recommended Reading flyer for suggested books:
Cisco LAN Switching Fundamentals (by David Barnes, Basir Sakandar)
Cisco Catalyst QoS: Quality of Service in Campus Networks (by Richard Froom, Mike Flannagan, Kevin Turek)
Available Onsite at the Cisco Company Store
Appendices: Reference Materials
QoS troubleshooting
WS-SUP32P (PISA) troubleshooting
Modular IOS troubleshooting
Monitoring the health of the system (GOLD/EEM)
QoS Common Issues
Marking and Policing
Flow mask conflicts (micro-flow policing)
Unsupported configurations
QoS Gotcha’s
Policing and Marking
Use traffic that will produce reliable results with a policer
TCP traffic will yield rates below the CIR due to the slow-start algorithm and retransmissions.
A traffic generator should be used to source and receive the traffic. This allows complete control of the ingress rate and an accurate measure of the egress rate.
UDP traffic can also be used since it does not use the slow-start algorithm or suffer from retransmissions.
Any type of traffic should be OK for testing marking
Use Appropriate Traffic to Test Policing
Policing and Marking
Make sure the desired traffic is passing through the interface to which the policer is applied
Use a packet sniffer to capture the ingress traffic and check the destination MAC address on the frames
Make sure the desired traffic is in accord with the match criteria in the class-map
Examine the packets in the sniffer trace to see if they match
How Do I Verify Correct Policing or Marking ?
Policing and Marking
Are you using a supported match criteria in your class-map?
Supported
  match precedence
  match dscp
  match access-group
Not Supported
  match cos
  match class-map
  match source-address
  match destination-address
  match input-interface
  match qos group
Match Criteria
Policing and Marking
Make sure the ingress interface is configured for port-based or VLAN-based QoS in agreement with how the service-policy is applied.
!
policy-map police-host-to-host
 class host-to-host
  police cir 9000000 bc 281250 be 281250 conform-action set-dscp-transmit cs5 exceed-action drop violate-action drop
!
interface Vlan20
 ip address 20.20.20.1 255.255.255.0
 ip pim sparse-dense-mode
 load-interval 30
 service-policy input police-host-to-host
interface GigabitEthernet9/1
 switchport
 switchport access vlan 20
 switchport mode access
 mls qos vlan-based
Cat6K#show mls qos
QoS is enabled globally
Policy marking depends on port_trust
QoS ip packet dscp rewrite enabled globally
Input mode for GRE Tunnel is Pipe mode
Input mode for MPLS is Pipe mode
QoS is vlan-based on the following interfaces:
Gi9/1
Port-Based and VLAN-Based QOS
Policing and Marking
Use show mls qos last [module [slot]] to see a snapshot of the last packet switched by the hardware.
Cat6K#show mls qos last
----- Module [5] -----
QoS last packet policing information:
---------------------------------------------------------------------
Packet was dropped
Packet L3 Prot: 0, packet length: 1518, dont_plc: No
Input COS: 0, TOS/DSCP: 0x0/0
Output TOS/DSCP: 0xA0/40[rewritten] Output COS: 5[rewritten]
Output MPLS EXP (if outgoing packet is MPLS): 5
---------------------------------------------------------------------
Aggregate policer index: Input - 1, Output - 0(none)
thr_hi_ip: 0x44D leak_hi_ip: 0x233 drop_ena_ag_ip: Yes
thr_lo_ip: 0x44D leak_lo_ip: 0x233
thr_hi_op: 0x0 leak_hi_op: 0x3FF drop_ena_ag_op: No
thr_lo_op: 0x0 leak_lo_op: 0x3FF
---------------------------------------------------------------------
Microflow policer index: Input - 0(none), Output - 0(none)
---------------------------------------------------------------------
Netflow policer: nf_hit: Yes, nf_addr: 0x83, snap-shot matches
NT&NS: l3_prot: 1(0), 172.16.25.1.0x0000 ==> 10.10.10.100.0x0000
Shows that the last packet was dropped
Shows last packet Source ==> destination
along with L4 port numbers
Note: May be difficult to catch the interesting traffic as this will show the last packet switched in hardware.
Shows input and output CoS, ToS and DSCP
for the packet
How Do I Verify the Correct Policing or Marking ?
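The “Output TOS/DSCP: 0xA0/40” pair in the output is the same value written two ways: DSCP is the upper six bits of the ToS byte. A short Python sketch of that arithmetic, useful when checking that the rewritten value matches the “set-dscp-transmit cs5” action in the policy; the function names are illustrative:

```python
def tos_to_dscp(tos: int) -> int:
    """DSCP is the upper six bits of the IPv4 ToS byte."""
    return (tos >> 2) & 0x3F


def tos_to_precedence(tos: int) -> int:
    """IP precedence is the upper three bits of the ToS byte."""
    return (tos >> 5) & 0x7


def class_selector(n: int) -> int:
    """DSCP value of class selector CSn (CS5 -> 40)."""
    return n << 3


tos = 0xA0  # "Output TOS" from the show mls qos last output
print(tos_to_dscp(tos))        # 40
print(tos_to_precedence(tos))  # 5
print(tos_to_dscp(tos) == class_selector(5))  # True
```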
Flow Mask Conflicts
The NetFlow feature collects traffic statistics about the packets that flow through the switch and stores the statistics in the NetFlow table.
NetFlow Table: Resides on the PFC and is divided into two pieces
Netflow Key Table: Stores the actual flows
Netflow Statistics Table: Stores the flow information like number of packets and number of bytes switched per flow
Netflow
Flow Mask Conflicts
A flow is identified using the following fields:
Source IP Address
Destination IP Address
Source TCP/UDP Port Number
Destination TCP/UDP Port Number
IP Protocol Type
Input VLAN
Which fields are used to identify and store flows in the NetFlow table?
The PFC uses a flow mask to identify which of the fields are used to identify and store the flow
What Is a Flow ?
Flow Mask Conflicts
[Figure: six rows, one per flow mask, each listing the fields VLAN, SRC IP, DST IP, IP Protocol, Src Port, Dst Port. Masks shown: Full-Interface, Full, Destination-Source-Interface, Source-only, Destination, Destination-Source.]
Flow Masks
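A flow mask can be thought of as a projection: it picks which packet fields make up the NetFlow key, so two packets that agree on those fields share one table entry. The Python sketch below illustrates the idea. The field sets per mask are an assumption inferred from the mask names and general PFC3 documentation, since the figure's highlighting did not survive in this transcript; Packet, FLOW_MASKS and flow_key are illustrative names.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    vlan: int
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


# Assumed field sets per mask (not recoverable from the slide text itself).
FLOW_MASKS = {
    "source-only": ("src_ip",),
    "destination": ("dst_ip",),
    "destination-source": ("src_ip", "dst_ip"),
    "destination-source-interface": ("vlan", "src_ip", "dst_ip"),
    "full": ("src_ip", "dst_ip", "protocol", "src_port", "dst_port"),
    "full-interface": ("vlan", "src_ip", "dst_ip", "protocol",
                       "src_port", "dst_port"),
}


def flow_key(packet: Packet, mask: str) -> tuple:
    """Project a packet onto the fields selected by the flow mask."""
    return tuple(getattr(packet, field) for field in FLOW_MASKS[mask])


telnet = Packet(20, "172.16.25.1", "10.10.10.100", 6, 40000, 23)
http = Packet(20, "172.16.25.1", "10.10.10.100", 6, 40001, 80)

# Same host pair: one flow under destination-source, two under full.
print(flow_key(telnet, "destination-source") == flow_key(http, "destination-source"))
print(flow_key(telnet, "full") == flow_key(http, "full"))
```

This is why features that need different masks on the same interface can conflict: only one projection can be in effect for a given lookup.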
Flow Mask Conflicts
Several Catalyst 6500 features use the NetFlow Table for their operation. These include:
Micro-flow Policing
NetFlow Data Export (NDE)
IOS-SLB
Reflexive ACLs
TCP-Intercept
WCCP
CBAC
NAT/PAT
Each feature requires a specific flow mask when configured
These requirements can conflict with one another
Netflow Features
Flow Mask Conflicts
If any of the NetFlow features are configured along with a micro-flow service policy on a given interface, the flow mask for the micro-flow policer must be Full
When a conflict occurs with NDE, the feature configured first takes precedence and the one configured later gets a flow mask conflict
When a conflict occurs with one of the other NetFlow features, micro-flow policing takes precedence and the other feature is processed in software
Conflicts with Micro-Flow Policing
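The two precedence rules above can be written down as a small decision function. This Python sketch only encodes what the slide states; resolve_conflict and its return layout are hypothetical, and real behaviour depends on the software release:

```python
def resolve_conflict(other_feature: str, microflow_configured_first: bool) -> dict:
    """Apply the slide's two rules for a micro-flow policing flow mask conflict."""
    policing = "micro-flow policing"
    if other_feature.upper() == "NDE":
        # Rule 1: against NDE, whichever feature was configured first wins.
        winner = policing if microflow_configured_first else "NDE"
        loser = "NDE" if microflow_configured_first else policing
        return {"winner": winner, "loser": loser,
                "loser_outcome": "flow mask conflict"}
    # Rule 2: against any other NetFlow feature, micro-flow policing wins.
    return {"winner": policing, "loser": other_feature,
            "loser_outcome": "processed in software"}


print(resolve_conflict("NDE", microflow_configured_first=False))
print(resolve_conflict("NAT/PAT", microflow_configured_first=False))
```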
Flow Mask Conflicts
When you configure the second feature that causes conflict with a previously existing feature a log message will be generated.
Cat6K(config)#int vlan 20
Cat6K(config-if)#service-policy input police-host-to-host
Cat6K(config-if)#
QoS-ERROR: QoS policy on interface Vl20 cannot be successfully installed due to the interaction with other feature configuration
Failure reason is Unresolvable flowmask conflict with other features
QoS-ERROR: installation of policy on Vl20 failed
5w2d: %FM-2-FLOWMASK_CONFLICT: Features configured on interface Vlan20 have conflicting flowmask requirements, traffic may be switched in software
How Do I Know There Is a Conflict ?
Flow Mask Conflicts
Cat6K#show fm interface vlan 20
Interface: Vlan20 IP is enabled
hw_state[INGRESS] = not reduced, hw_state[EGRESS] = not reduced
mcast = 1
priority = 0
flags = 0x0
parent[INGRESS] = none
inbound label: 35
Feature FM_GUARDIAN:
Features Bumping the flowmask on the interface:
NDE
Feature NAT_INGRESS:
In this case it’s NDE
How Can I Tell What Feature Is the Cause of the Problem ?
Unsupported Features
The following are unsupported policy map class commands for PFC/DFC QoS
Bandwidth
Priority
Queue-limit
Random-detect
Set qos-group
Service policy (nested policies are not supported)
Unsupported Features
None of the following are supported on Ethernet modules
CBWFQ
LLQ
WRED
Class-based Shaping
Hierarchical Traffic Shaping
***All are supported on OSMs, FlexWAN and Enhanced FlexWAN. See the QoS configuration guide for OSM and FlexWAN modules for specific caveats
QOS Gotchas
Each PFC and DFC polices independently
This will affect policers applied to port-channels and SVI’s
Egress policing is applied at the ingress interface
The ingress PFC/DFC makes the policing decision NOT the egress PFC/DFC
Ingress and Egress policing applied to the same traffic must have the same policy
Both must mark down or both must drop
Egress ACL uses ingress marking by default
Remember the Following Points !!
WS-Sup32P – PISA
Generic L2/L3 troubleshooting is similar to previous sections
PISA specific troubleshooting focus is on packets going to/coming from the PISA daughter card
What tables to look at
What counters to look at
Programmable IP Services Accelerator (PISA)
WS-Sup32P – PISA
Programmable IP Services Accelerator (PISA)
[Figure: block diagram of the Supervisor Engine 32 baseboard and the PISA daughter card, joined by the PISA channel. Labelled blocks: PFC3B daughter card, L2 Engine, L3/4 Engine, Replication Engine, Bus, Port ASIC, GE Uplinks, RP CPU, SP CPU, Classification and Dispatch Engine, Network Processor, Micro Engines, CPU, DRAM (1 GB, 768 MB, 512 MB), SRAM (32M). Link labels: 10G, 1-3G, 1 Gbps.]
Incoming packet on bus gets redirected to PISA based on PFC3B ACL redirect (ingress) or modified FIB entry (egress)
Packet goes back to EARL for FIB lookup (ingress case) or L2 lookup/VACL (egress case)
L2/L3 Engine counters & tables
Up to 3 Gbps internal EtherChannel interface for PISA connection (Po256)
Port counters
CDE counters
Network Processor Accelerates NBAR and FPM at up to 2 Gbps
CDE redirects packets to NP or RP
NP counters
WS-Sup32P – PISA
Verify the internal PISA port channel (Po256) is up
Verify that the L2/L3 Forwarding Engine on the PFC3B redirects packets correctly to PISA
Look at the internal port channel counters (both the port ASIC side and the CDE side) and PISA specific tables
Some useful commands and tools
Troubleshooting Sequence
WS-Sup32P
Is the Internal Port Channel to PISA OK ?
DUT#sh int po 256
Port-channel256 is up, line protocol is up (connected)
input flow-control is on, output flow-control is on
Members in this channel: Gi6/8 Gi6/10
DUT#sh running-config
interface Port-channel256
mtu 4160
…
flowcontrol receive on
flowcontrol send on
pisa-channel
interface GigabitEthernet6/8
mtu 4160
…
flowcontrol receive on
flowcontrol send on
no cdp enable
channel-group 256 mode on
Truncated output
Verify flowcontrol, MTU and pisa-channel configuration on port channel interface, and its physical members;
Truncated output
Extra command: show etherchannel 256 detail
WS-Sup32P
Do the Packets Get Punted to PISA ?
DUT#show class-map TELNET-traffic
Class Map match-all TELNET-traffic (id 2)
Match protocol telnet
DUT#show policy-map
Policy Map Vlan701
Class TELNET-traffic
set dscp af42
DUT#conf t
Enter configuration commands, one per line. End with CNTL/Z.
DUT(config)#int vlan 701
DUT(config-if)#service-policy input Vlan701
DUT(config-if)#
04:24:24: %PISA-6-NBAR_ENABLED: feature accelerated on input direction of: Vlan701
04:24:24: %PISA-6-NBAR_ENABLED: feature accelerated on output direction of: Vlan701
^Z
DUT#
DUT#sh tcam interface vlan 701 acl in ip
* Global Defaults shared
Entries from Bank 0
Entries from Bank 1
permit ip any 224.0.0.0 15.255.255.255 (105 matches)
policy-route ip any any (110 matches)
deny ip any any
Configuration just for illustration
Applied service policy to Vlan 701
There is a “policy-route” type entry programmed in TCAM, meaning packet matching this will get redirected to PISA module
WS-Sup32P
Do the Packets Get Punted to PISA ?
DUT#sh tcam interface vlan 701 acl in ip detail
Interface: 701 label: 1537 lookup_type: 0
protocol: IP packet-type: 0
+-+-----+---------------+---------------+---------------+---------------+-------+---+----+-+---+--+---+---+
|T|Index| Dest Ip Addr | Source Ip Addr| DPort | SPort | TCP-F |Pro|MRFM|X|TOS|TN|COD|F-P|
+-+-----+---------------+---------------+---------------+---------------+-------+---+----+-+---+--+---+---+
V 36250 0.0.0.0 0.0.0.0 P=0 P=0 ------ 0 ---- 1 0 -- C-- 0-0 <-
M 36251 0.0.0.0 0.0.0.0 0 0 0 ---- 1 0 <-
R rslt: REDIRECT_ADJACENCY (*) rtr_rslt: PERMIT_RESULT (*) indx: 0x7E03 hit_cnt=118 <-
DUT#show mls cef adjacency entry 0x7F803 detail
Index: 522243 mtu: 65535, vlan: 0, dindex: 0x340, l3rw_vld: 1
format: RECIR, flags: 0xA0000001000E00
packets: 140, bytes: 8960
DUT#show table ltl module 6 start 0x340 end 0x340
LTL indexes from: 0x340 to 0x340 - slot: 6
Index Ports
---------+----------------------------------------------------
0x00340 8,10
DUT#show interface Port-channel 256 | i Members
Members in this channel: Gi6/8 Gi6/10
Truncated output
Calculate redirect index: 0x7E03 – 0x7E00 + 0x7F800 = 0x7F803
Gets redirected to internal port index 0x340, matching the port channel 256 to PISA module
Index 0x340 maps to interfaces 8 and 10 on module 6 (so Gi6/8 and Gi6/10), matches Po256 Members !! …. If not correct, packets won’t get punted to NP
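The index arithmetic and the final member comparison on this slide can be reproduced in a few lines. A Python sketch, assuming the two base values quoted in the slide's calculation (0x7E00 and 0x7F800) apply as shown; the function names and the “Gi” prefix are illustrative:

```python
TCAM_REDIRECT_BASE = 0x7E00  # subtracted from the TCAM result "indx" on the slide
ADJACENCY_BASE = 0x7F800     # added to reach the adjacency entry


def redirect_adjacency_index(tcam_indx: int) -> int:
    """Translate the TCAM redirect index into the adjacency entry index."""
    return tcam_indx - TCAM_REDIRECT_BASE + ADJACENCY_BASE


def ltl_ports_to_interfaces(slot: int, ports: str, prefix: str = "Gi") -> set:
    """Turn an LTL port list such as '8,10' into interface names for a slot."""
    return {f"{prefix}{slot}/{port.strip()}" for port in ports.split(",")}


adjacency = redirect_adjacency_index(0x7E03)
print(hex(adjacency), adjacency)  # 0x7f803 522243

po256_members = {"Gi6/8", "Gi6/10"}  # from "show interface Port-channel 256"
print(ltl_ports_to_interfaces(6, "8,10") == po256_members)  # True
```

The decimal value 522243 matches the “Index:” field printed by “show mls cef adjacency entry 0x7F803 detail”.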
WS-Sup32P
Do the Packets Get Out of Port Channels to PISA?
DUT#show interface port-channel 256 counters
DUT#show interface port-channel 256 counters errors
DUT#show interface port-channel 256
…
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 266000 bits/sec, 211 packets/sec
30 second output rate 273000 bits/sec, 211 packets/sec
DUT#show flowcontrol interface gi6/8
Port Send FlowControl Receive FlowControl RxPause TxPause
admin oper admin oper
----- -------- -------- -------- -------- ------- -------
Gi6/8 on on on on 0 0
DUT#show flowcontrol interface gi6/10
Port Send FlowControl Receive FlowControl RxPause TxPause
admin oper admin oper
----- -------- -------- -------- -------- ------- -------
Gi6/10 on on on on 0 0
Check counters on internal port channel are moving, any errors … ?
Indication CDE is flow controlling towards the port ASIC, because e.g. NP is too busy
Check the flow controlling status of the individual links in the internal port channel, if we get flow controlled, it indicates the CDE is backed up because e.g. NP is too busy
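Checking the pause counters on both member links can be scripted. A minimal Python sketch, assuming the “show flowcontrol interface” row layout shown above (port, send admin/oper, receive admin/oper, RxPause, TxPause). The slide does not say which of the two counters indicates CDE back-pressure, so this sketch flags either one being non-zero; the helper names are hypothetical:

```python
def parse_flowcontrol(output: str) -> list:
    """Extract data rows from captured 'show flowcontrol interface' text."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have seven columns ending in two numeric pause counters.
        if len(fields) == 7 and fields[5].isdigit() and fields[6].isdigit():
            rows.append({
                "port": fields[0],
                "send_oper": fields[2],
                "receive_oper": fields[4],
                "rx_pause": int(fields[5]),
                "tx_pause": int(fields[6]),
            })
    return rows


def ports_with_pause(rows: list) -> list:
    """Ports on which any pause frames were counted."""
    return [r["port"] for r in rows if r["rx_pause"] or r["tx_pause"]]


# Rows copied from the slide output.
SAMPLE = """\
Port Send FlowControl Receive FlowControl RxPause TxPause
admin oper admin oper
----- -------- -------- -------- -------- ------- -------
Gi6/8 on on on on 0 0
Gi6/10 on on on on 0 0
"""

rows = parse_flowcontrol(SAMPLE)
print([r["port"] for r in rows])
print(ports_with_pause(rows))
```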
WS-Sup32P
Do the Packets Get into the CDE ?
DUT#show platform hardware pisa cde counters
GIGMAC0 RX-Counter = 221348961
GIGMAC1 RX-Counter = 0
GIGMAC2 RX-Counter = 1813
GIGMAC3 RX-Counter = 79673
GIGMAC0 TX-Counter = 221348817
…
GIGMAC0 RX-DRP-CNT = 0
…
GIGMAC0 RX-UNSZ-CNT = 0
…
GIGMAC0 RX-OVSZ-CNT = 0
…
GIGMAC0 TX-DRP-CNT = 0
…
SPI-CDP-TO-IXP TX-Cnt = 221256824
SPI-IXP-TO-CDP TX Cnt = 221256823
SPI-IXP Chan0-ERR-Cnt = 0
SPI-IXP Chan8-ERR-Cnt = 0
RP-TO-CDEP-FIFO-Cnt = 1386
CDEP-TO-RP-FIFO-Cnt = 79930
CDEP-RP-FIFO-CRC-Cnt = 0
Truncated output
Interface counters for the GIGMACs on the CDE side of the internal port channel: do they move?
CDE to NP (IXP 2800 complex) and IXP to CDE counters: do they move? Any errors?
RP to CDE and CDE to RP counters: do they move? Any errors?
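Answering "do they move?" means comparing two snapshots of the same command. A rough sketch of that comparison in Python follows, assuming each snapshot of show platform hardware pisa cde counters is available as text in the "name = value" form shown above. The helper names and the list of substrings used to recognise error counters are assumptions made for illustration.

```python
import re

# One counter per line, in the form "NAME = VALUE".
COUNTER = re.compile(r"^(.+?)\s*=\s*(\d+)\s*$")


def parse_counters(text):
    """Parse 'NAME = VALUE' lines into a dict; other lines are ignored."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER.match(line.strip())
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


def moving(before, after):
    """Names of counters that increased between the two snapshots."""
    return sorted(n for n in after if n in before and after[n] > before[n])


def nonzero_errors(counters, markers=("DRP", "ERR", "CRC", "UNSZ", "OVSZ")):
    """Names of non-zero counters that look like drop or error counters.

    The marker substrings are an assumption based on the names on this slide.
    """
    return sorted(n for n, v in counters.items()
                  if v and any(m in n for m in markers))
```

Take the first snapshot, wait a few seconds while traffic is flowing, take the second, and then call moving(parse_counters(first), parse_counters(second)).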
WS-Sup32P: Looking at the NP Counters
DUT#show platform hardware pisa np ?
ME ME Counters
acl Access-list
all All
fpm Flexible Packet Matching Info
mqc Modular QoS CLI Info
nbar Network Based Application Recognition Info
rx Receive Engine Info
tx Transmit Engine Info
DUT#show platform hardware pisa np nbar counters
NBAR Statistics(ME2)
--------------------
NBAR Pkts Received : 325
NBAR Pkts Classified: 325
PD Pkts Received : 0
NBAR Pkts Out : 325
NBAR Debug 0 : 82
NBAR Debug 1 : 81
NBAR Debug 2 : 81
NBAR Debug 3 : 81
Truncated output
RX = what comes in from the CDE; TX = what goes back to the CDE
In our example, Telnet traffic was reclassified based on NBAR
WS-Sup32P: Looking at the Split VLAN
Packets leaving the NP are in a VLAN that differs from the ingress VLAN but is associated with it: the split VLAN
That is, the adjacency info for the next hop rewrites the packet into the split VLAN, so that the egress packet (from the PFC viewpoint) is sent to PISA to have the egress feature applied
DUT#sh platform software pisa split-vlan interface vlan 701
Codes: P - NBAR PD, N - NBAR, F - FPM, 0x380 - RP, 0x340 - IXP
Interface Vlan PisaVlan InFeat OutFeat DestIndex State
-------------------------------------------------------------------------------------
Vlan701 701 1022 N - - N - - 0x340 up
DUT#show mac-address-table vlan 1022
Legend: * - primary entry
age - seconds since last seen
n/a - not available
vlan mac address type learn age ports
------+----------------+--------+-----+----------+--------------------------
* --- 0006.52b4.8000 static No - Router
For egress-side features, the only difference in the troubleshooting steps is that there is no redirect ACL. Instead, the FIB entry for the egress interface points to an internal VLAN, which is used to send the packet to the PISA channel. The egress feature is then applied, and the packet goes back to the EARL for L2 features (like VACL) and forwarding.
Verify that the split VLAN has the router MAC programmed.
After this, the modified packet is L3 switched by the PFC3B.
WS-Sup32P Troubleshooting Tools: Useful Commands and Tools
Information on the PISA hardware status:
show platform hardware pisa health
Overall load on PISA:
show platform hardware pisa np all
In case of a performance problem (e.g. BGP is flapping due to PISA channel congestion), or simply to keep a certain type of traffic from being sent to PISA, the user can enable the "skip ACL" feature:
DUT# show run int vlan 701
interface Vlan701
ip address 7.1.1.1 255.255.255.0
no ip redirects
platform ip features pisa access-group skip_bgp
service-policy input Vlan701
DUT# show access-list skip_bgp
Extended IP access list skip_bgp
10 deny tcp any any eq bgp
20 permit ip any any
Use a deny ACL entry to skip BGP. A match-all permit entry at the bottom of the access list is needed to send all the rest to PISA. Remember, this is a debug tool!
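The interaction between the deny entry and the trailing permit entry follows the usual first-match ACL logic. The sketch below models it in Python purely as an illustration: the Entry class and the sent_to_pisa function are hypothetical names, only the protocol and destination port are modelled, and nothing here reflects how the hardware actually programs the feature.

```python
from dataclasses import dataclass
from typing import Optional

BGP_PORT = 179  # well-known TCP port for BGP


@dataclass
class Entry:
    action: str                     # "permit" or "deny"
    protocol: str                   # "tcp", "udp", ... or "ip" (matches any IP packet)
    dst_port: Optional[int] = None  # None means "any port"


# Simplified model of the skip_bgp access list shown above.
SKIP_BGP = [
    Entry("deny", "tcp", BGP_PORT),  # 10 deny tcp any any eq bgp
    Entry("permit", "ip"),           # 20 permit ip any any
]


def sent_to_pisa(acl, protocol, dst_port=None):
    """First matching entry wins: permit = send to PISA, deny = skip PISA."""
    for entry in acl:
        protocol_ok = entry.protocol == "ip" or entry.protocol == protocol
        port_ok = entry.dst_port is None or entry.dst_port == dst_port
        if protocol_ok and port_ok:
            return entry.action == "permit"
    return False  # implicit deny at the end of every access list
```

Dropping the second entry makes every packet fall through to the implicit deny, so nothing would reach PISA; that is why the match-all permit entry is needed.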
Appendices: Reference Materials
QoS troubleshooting
WS-SUP32P (PISA) troubleshooting
Modular IOS troubleshooting
Monitoring the health of the system (GOLD/EEM)
Typical Problems with Modular IOS
Process Crash
Memory Leak
High CPU utilization
Crashes
Crashes will require TAC involvement
Open a TAC service request and collect the following info:
1. Crashinfo file
2. Core file (if configured)
3. Show tech-support
4. What you were doing that made it crash!
Example of Process Crash Output
00:05:29: %DUMPER-3-PROCINFO: pid = 16427: (sbin/tcp.proc), terminated due to signal SIGTRAP, trace trap (not reset when caught) (Signal from user)
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: zero at v0 v1
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R0 00000000 00000000 00000004 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: a0 a1 a2 a3
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R4 7BC22298 00000000 00000000 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: t0 t1 t2 t3
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R8 00000000 00000000 00000000 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: t4 t5 t6 t7
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R12 00000000 00000000 00000000 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: s0 s1 s2 s3
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R16 00FDDFA0 00000000 00000000 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: s4 s5 s6 s7
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R20 00000000 00000000 00000000 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: t8 t9 k0 k1
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R24 00000000 722B3F4C 00000000 00000000
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: gp sp s8 ra
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R28 7828FF90 00FDDF60 00000000 72297450
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: sr lo hi bad
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R32 1001FC73 00000000 00000000 78288970
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: cause pc epc
00:05:29: %DUMPER-3-REGISTERS_INFO: 16427: R36 00800020 722B3F5C 00000000
00:05:29: %DUMPER-3-TRACE_BACK_INFO: 16427: (libc.so+0x2EF5C) (libc.so+0x12450) (s72033_rp-adventerprisek9_wan-58-dso-p.so+0x17C00) (libc.so+0x127AC)
00:05:30: %DUMPER-3-CRASHINFO_FILE_NAME: 16427: Crashinfo for process sbin/tcp.proc at bootflash:/crashinfo_tcp.proc-20050910-012841
00:05:30: %DUMPER-3-CORE_FILE_NAME: 16427: Core for process sbin/tcp.proc at disk0:/tcp.proc.012842.dmp.Z
00:05:31: %DUMPER-5-DUMP_SUCCESS: 16427: Core dump success
00:05:31: %SYSMGR-3-ABNORMTERM: tcp.proc:1 (jid 91) abnormally terminated, restarted scheduled
Crashing process name
Crashing process ID
Crashinfo filename and location
Core filename and location
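The four highlighted items can be pulled out of a saved log automatically before opening the service request. This is a small Python sketch, assuming the syslog has been saved as text; the regular expressions are written against the three DUMPER messages in this example only, and crash_summary is a hypothetical helper name.

```python
import re

# The three relevant messages from the crash output on this slide.
LOG = """\
00:05:29: %DUMPER-3-PROCINFO: pid = 16427: (sbin/tcp.proc), terminated due to signal SIGTRAP, trace trap (not reset when caught) (Signal from user)
00:05:30: %DUMPER-3-CRASHINFO_FILE_NAME: 16427: Crashinfo for process sbin/tcp.proc at bootflash:/crashinfo_tcp.proc-20050910-012841
00:05:30: %DUMPER-3-CORE_FILE_NAME: 16427: Core for process sbin/tcp.proc at disk0:/tcp.proc.012842.dmp.Z
"""

PROCINFO = re.compile(r"%DUMPER-3-PROCINFO: pid = (\d+): \((\S+)\)")
FILE_NAME = re.compile(
    r"%DUMPER-3-(CRASHINFO|CORE)_FILE_NAME: \d+: \w+ for process \S+ at (\S+)"
)


def crash_summary(log):
    """Collect process name, PID, crashinfo path and core path from the log."""
    summary = {}
    for line in log.splitlines():
        match = PROCINFO.search(line)
        if match:
            summary["pid"] = int(match.group(1))
            summary["process"] = match.group(2)
        match = FILE_NAME.search(line)
        if match:
            # Key is "crashinfo" or "core", value is the file location.
            summary[match.group(1).lower()] = match.group(2)
    return summary
```

The resulting dictionary lists exactly the files that need to be copied off the switch and attached to the case.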
Example of What Files to Collect After Crash
For the tcp.proc process crash on the previous slide, collect the following files:
Cat6K#dir bootflash:
Directory of bootflash:/
4 -rw- 139528 Sep 9 2008 19:28:42 -06:00 crashinfo_tcp.proc-20050910-012841
65536000 bytes total (64979832 bytes free)
Cat6K#dir disk0:
Directory of disk0:/
1 -rw- 111923344 Sep 1 2008 10:26:54 -06:00 s72033-adventerprisek9_wan_dbg-vz.PP_R31_INTEG_050829
2 -rw- 112078968 Sep 9 2008 14:50:54 -06:00 s72033-adventerprisek9_wan_dbg-vz.pikespeak_r31_0908_1
3 -rw- 107608208 Sep 9 2008 18:50:04 -06:00 s72033-adventerprisek9_wan-vz.122-99.SX1010
4 -rw- 131517 Sep 9 2008 19:28:42 -06:00 tcp.proc.012842.dmp.Z
512040960 bytes total (180281344 bytes free)
Crashinfo filename and location
Both filenames encode the process that crashed
Restarting a Process
To restart a process, use the command process restart [process]
Restarting the process produces a log message stating that the process has been respawned, along with a traceback
Use show processes detailed [process] to see that a process has been restarted
Cat6K#process restart tcp.proc
Restarting process tcp.proc
Cat6K#
03:47:08: %SYSMGR-6-RESPAWN: Process tcp.proc:1 has been respawned : sysmgr.proc : (PID=20498, TID=14) : -Traceback=(s72033_rp-ipservices_wan-57-dso-p.so+0x11364) ([36:0]+0x134FC) ([36:0]+0xB418) ([25:-9]1+0x167C) ([35:0]+0x39B4) ([35:0]+0x3F48) ([0:-3]libc+0x252D4) ([7:0]+0x127AC)
Cat6K#
Restarting a Process (Cont.)
Cat6K#show processes detailed tcp.proc
Job Id: 97
PID: 45097
Executable name: tcp.proc
Executable Path: sbin/tcp.proc
Instance ID: 1
Respawn: ON
Respawn count: 4
Respawn since last patch: 4
Max. spawns per minute: 30
Last started: Tue Apr 8 23:58:45 2008
Process state: Run
Process Redundancy State: Active (last exit status : 2)
Core: SHAREDMEM MAINMEM
Max. core: 0
Mandatory: ON
Last restart userid: user1
PID: new process ID
Executable name: process name
Respawn count: number of times the process has restarted
Last restart userid: user who restarted the process (requires AAA or local login enabled)
Configuring a Core Dump
Use the exception flash command to enable core file collection
Up to three choices for the file location are supported
Each location is tried in order until the core file is saved or the choices run out
Cat6K(config)#exception flash ?
bootflash: Device name
disk0: Device name
disk1: Device name
sup-bootflash: Device name
Memory Leak
Memory leaks will also require TAC involvement
Open a TAC service request and collect the following info:
1. Show clock*
2. Show memory
3. Show process memory detailed
Do several iterations of the above commands
* Show clock timestamps each iteration, which gives an indication of the leak rate
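The leak rate mentioned above is simply the growth in used memory divided by the time between two iterations. A minimal sketch of that arithmetic in Python follows; the function names are hypothetical, and the values come from whatever show clock and show memory report at each iteration.

```python
from datetime import datetime


def leak_rate_kb_per_hour(t1, used1_kb, t2, used2_kb):
    """Growth of used memory between two samples, in kilobytes per hour."""
    hours = (t2 - t1).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("the second sample must be later than the first")
    return (used2_kb - used1_kb) / hours


def hours_until_exhausted(free_kb, rate_kb_per_hour):
    """Rough time left until free memory runs out; None if there is no growth."""
    if rate_kb_per_hour <= 0:
        return None
    return free_kb / rate_kb_per_hour
```

For example, if a first sample showed 282464K used and a second sample two hours later showed 284512K used (an invented value for illustration), the rate would be 1024K per hour. A linear extrapolation like this is only a first estimate; several iterations are needed to confirm that the growth is steady.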
Show Memory
Show memory gives a high-level view of the leak
Look at the used and free memory as a first indication of a problem
Cat6K#show clock
*01:39:31.399 UTC Wed Apr 9 2008
Cat6K#show memory
System Memory: 524288K total, 282464K used, 241824K free, 1000K kernel reserved
Lowest(b) : 233308160
Show Process Memory Detailed
Show process memory detailed gives a more granular view
Cat6K#show processes memory detailed
System Memory : 524288K total, 282464K used, 241824K free, 1000K kernel reserved
Lowest(b) : 233308160
<SNIP>
Process sbin/ios-base, type IOS, PID = 24600
156592K total, 59376K text, 31412K data, 76K stack, 65728K dynamic
Heap : 67108864 total, 42759560 used, 24349304 free
Task  TTY  Allocated  Freed     Holding   Getbufs  Retbufs  TaskName
0     0    50898384   7511824   40820144  0        0        *Init*
0     0    45294808   44231528  1021560   0        0        *Neutrino*
182   0    913424     94112     934656    0        0        FM core
0     0    14614384   13934376  658832    4267800  0        *Dead*
170   0    466944     20456     432288    0        0        CEF process
31    0    261024     288       270752    120600   0        EEM ED Syslog
2     0    7067400    6980032   145408    0        0        Service Task
274   0    84672      18024     127328    0        0        QM Process on R
29    0    191888     2168      101112    0        0        IPC Seat Manage
19    0    122136     29608     99576     0        0        Entity MIB API
43    0    174464     52016     92584     0        0        rf proxy rp age
10    0    23066464   22885360  74328     0        0        Exec
140   0    22752      27848     61016     0        0        HWIF QoS Proces
276   0    192        192       61016     0        0        QM Timer ACL Pr
<SNIP>
The Task column gives the Task ID; for CEF process it is 170
Show Process Memory Detailed (Cont.)
Using the Task ID from the previous output allows us to drill down further to get the program counter (PC) value
Cat6K#show processes memory detailed ios-base taskid 170
System Memory : 524288K total, 282464K used, 241824K free, 1000K kernel reserved
Lowest(b) : 233308160
Process sbin/ios-base, type IOS, PID = 24600
156592K total, 59376K text, 31412K data, 76K stack, 65728K dynamic
Memory Summary for TaskID = 170
Holding = 432288
PC          Size    Count
0x75A6C430  320056  1
0x75A5CE54  93280   20
0x75CBBDC8  6744    1
0x73D12644  6184    1
0x75A69EDC  3056    1
0x75A5CE20  1600    20
0x73D3479C  640     1
0x75CBBCFC  208     1
0x73D15324  192     1
0x73D40F48  168     1
0x73CA404C  160     2
PC 0x75A6C430 is the largest contributor
The PC value has to be interpreted by the TAC
The sum of the sizes equals the holding value:
320056 + 93280 + 6744 + 6184 + 3056 + 1600 + 640 + 208 + 192 + 168 + 160 = 432288
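The relationship between the per-PC sizes and the Holding value can be checked mechanically, which is handy when the table is long. A short Python sketch using the rows from this slide follows; the function names are illustrative only.

```python
# (PC, size in bytes, allocation count) rows from the taskid 170 output.
ALLOCATIONS = [
    ("0x75A6C430", 320056, 1),
    ("0x75A5CE54", 93280, 20),
    ("0x75CBBDC8", 6744, 1),
    ("0x73D12644", 6184, 1),
    ("0x75A69EDC", 3056, 1),
    ("0x75A5CE20", 1600, 20),
    ("0x73D3479C", 640, 1),
    ("0x75CBBCFC", 208, 1),
    ("0x73D15324", 192, 1),
    ("0x73D40F48", 168, 1),
    ("0x73CA404C", 160, 2),
]
HOLDING = 432288  # "Holding" value reported for TaskID 170


def total_size(rows):
    """Sum of the Size column."""
    return sum(size for _pc, size, _count in rows)


def largest_contributor(rows):
    """Return (pc, fraction of the total) for the row with the largest size."""
    pc, size, _count = max(rows, key=lambda row: row[1])
    return pc, size / total_size(rows)
```

Here total_size(ALLOCATIONS) equals HOLDING, and the largest contributor, PC 0x75A6C430, accounts for roughly three quarters of the memory held by the task.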
High CPU Utilization
Check high-level CPU utilization with show process cpu*
The ios-base process is taking the majority
Cat6K#show process cpu | exclude 0.0
CPU utilization for five seconds: 63%; one minute: 54%; five minutes: 50%
PID    5Sec   1Min   5Min   Process
1      0.1%   0.3%   1.1%   kernel
24600  55.6%  47.6%  43.2%  ios-base
24615  7.1%   6.0%   5.2%   raw_ip.proc
* Use the pipe option with exclude 0.0 to eliminate the irrelevant output
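Picking the busiest process out of this output is easy to script when the check has to be repeated. The following Python sketch assumes the output of show process cpu has been captured as text in the five-column layout of this example; the names parse_cpu and busiest are hypothetical.

```python
import re

# Output from this slide, one row per line.
OUTPUT = """\
CPU utilization for five seconds: 63%; one minute: 54%; five minutes: 50%
PID 5Sec 1Min 5Min Process
1 0.1% 0.3% 1.1% kernel
24600 55.6% 47.6% 43.2% ios-base
24615 7.1% 6.0% 5.2% raw_ip.proc
"""

# PID, three percentages, process name.
ROW = re.compile(r"^(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\S+)$")


def parse_cpu(text):
    """Return one dict per process row; summary and header lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append({
                "pid": int(match.group(1)),
                "5sec": float(match.group(2)),
                "1min": float(match.group(3)),
                "5min": float(match.group(4)),
                "process": match.group(5),
            })
    return rows


def busiest(rows, window="5sec"):
    """Return the row with the highest utilization in the given window."""
    return max(rows, key=lambda row: row[window])
```

For the sample above, busiest(parse_cpu(OUTPUT)) identifies ios-base (PID 24600), which is the process to drill into next with show processes cpu detailed.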
High CPU Utilization (Cont.)
Now use show processes cpu detailed [process] to narrow down further
Cat6K#show processes cpu detailed ios-base | exclude 0.0
CPU utilization for five seconds: 61%; one minute: 57%; five minutes: 53%
PID/TID  5Sec   1Min   5Min   Process   Prio  STATE    CPU
24600    52.7%  49.1%  45.2%  ios-base                 17m38s
1        1.9%   1.9%   1.7%             10    Receive  29.961
4        4.9%   7.0%   6.7%             10    Receive  53.240
7        15.6%  14.9%  13.9%            21    Intr     4m10s
8        0.2%   0.2%   0.2%             22    Intr     85.812
12       10.8%  7.5%   4.6%             10    Reply    84.788
13       8.2%   8.0%   6.5%             10    Receive  2m11s
16       0.1%   1.3%   2.3%             10    Receive  88.128
17       11.0%  8.0%   6.7%             10    Receive  98.316
Process sbin/ios-base, type IOS, PID = 24600
CPU utilization for five seconds: 27%/25%; one minute: 24%; five minutes: 21%
Task  Runtime(ms)  Invoked  uSecs  5Sec    1Min    5Min    TTY  Task Name
2     76645        2597348  29     4.07%   4.01%   3.64%   0    Service Task
3     176849       2489254  71     13.19%  11.80%  10.34%  0    Service Task
11    40279        3829     10519  0.37%   0.13%   0.12%   0    Check heaps
126   126079       1311720  96     9.01%   7.95%   7.14%   0    IP Input
Appendices: Reference Materials
QoS troubleshooting
WS-SUP32P (PISA) troubleshooting
Modular IOS troubleshooting
Monitoring the health of the system (GOLD/EEM)
HW Installs, Moves, Changes
Deploying new hardware?
Hardware troubles are most common during changes:
Weekend chassis install.
Weekend config changes.
Late night line card replacement.
What can we do to make these evolutions less painful?
Generic Online Diagnostics: What Is GOLD?
GOLD defines a common framework for diagnostics operations across Cisco platforms running Cisco IOS Software
Goal: check the health of hardware components and verify proper operation of the system data plane and control plane at run-time and boot-time
Provides a common CLI and scheduling for field diagnostics including:
Bootup tests (includes online insertion)
Health-monitoring tests (background, non-disruptive)
On-demand tests (disruptive and non-disruptive)
User-scheduled tests (disruptive and non-disruptive)
CLI access to data via management interface
Generic Online Diagnostics: How Does GOLD Work?
Diagnostic packet switching tests verify that the system is operating correctly:
Is the supervisor control plane and forwarding plane functioning properly?
Is the standby supervisor ready to take over?
Are line cards forwarding packets properly?
Are all ports working?
Is the backplane connection working?
Other types of diagnostic tests, including memory and error correlation tests, are also available
[Diagram: active supervisor (CPU, forwarding engine, fabric), standby supervisor (forwarding engine), and two line cards]
Generic Online Diagnostics: What Type of Failure Does GOLD Detect?
Diagnostics capabilities built in hardware
Depending on hardware, GOLD can catch:
Port failure
Bent backplane connector
Bad fabric connection
Malfunctioning forwarding engines
Stuck control plane
Bad memory
Generic Online Diagnostics: Diagnostic Operation
Boot-Up Diagnostics
Run during system bootup, line card OIR, or supervisor switchover
Makes sure faulty hardware is taken out of service
Switch(config)# diagnostic bootup level complete
Runtime Diagnostics
Health-Monitoring
Non-disruptive tests run in the background
Serves as HA trigger
Switch(config)# diagnostic monitor module 5 test 2
Switch(config)# diagnostic monitor interval module 5 test 2 00:00:15
On-Demand
All diagnostic tests can be run on demand for troubleshooting purposes. On-demand diagnostics can also be used as a pre-deployment tool.
Switch# diagnostic start module 4 test 8
Module 4: Running test(s) 8 may disrupt normal system operation
Do you want to continue? [no]: y
Switch# diagnostic stop module 4
Scheduled
Schedule diagnostic tests for verification and troubleshooting purposes
Switch(config)# diagnostic schedule module 4 test 1 port 3 on Jan 3 2005 23:32
Switch(config)# diagnostic schedule module 4 test 2 daily 14:45
Generic Online Diagnostics: Using Diagnostics as a Pre-Deployment Tool
The order in which tests are run matters
Run diagnostics first on line cards, then on supervisors
Run packet switching tests first, run memory tests after
Switch# diagnostic start module 6 test all
Module 6: Running test(s) 8 will require resetting the line card after the test has completed
Module 6: Running test(s) 1-2,5-9 may disrupt normal system operation
Do you want to continue? [no]: yes
*Mar 25 22:43:16: %DIAG-SP-6-TEST_RUNNING: Module 6: Running TestTransceiverIntegrity{ID=1} ...
*Mar 25 22:43:16: %DIAG-SP-3-TEST_SKIPPED: Module 6: TestTransceiverIntegrity{ID=1} is skipped
*Mar 25 22:43:16: %LINK-5-CHANGED: Interface GigabitEthernet6/1, changed state to administratively down
*Mar 25 22:43:16: %DIAG-SP-6-TEST_RUNNING: Module 6: Running TestLoopback{ID=2} ...
*Mar 25 22:43:16: %DIAG-SP-6-TEST_RUNNING: Module 6: Running TestAsicMemory{ID=8} ...
*Mar 25 22:43:16: SP: ******************************************************************
*Mar 25 22:43:16: SP: * WARNING:
*Mar 25 22:43:16: SP: * ASIC Memory test on module 6 may take up to 2hr 30min.
*Mar 25 22:43:16: SP: * During this time, please DO NOT perform any packet switching.
*Mar 25 22:43:16: SP: ******************************************************************
<snip>
Switch# diagnostic start module 5 test all
Module 5: Running test(s) 27-30 will power-down line cards and standby supervisor should be power-down manually and supervisor should be reset after the test
Module 5: Running test(s) 26 will shut down the ports of all linecards and supervisor should be reset after the test
Module 5: Running test(s) 3,5,8-10,19,22-23,26-31 may disrupt normal system operation
Do you want to continue? [no]: yes
<snip>
Generic Online Diagnostics: Catalyst GOLD Operation Example
Switch# show diagnostic content mod 5
Module 5: Supervisor Engine 720 (Active)
<snip>
Testing Interval
ID Test Name Attributes (day hh:mm:ss.ms)
==== ================================== ============ =================
1) TestScratchRegister -------------> ***N****A*** 000 00:00:30.00
2) TestSPRPInbandPing --------------> ***N****A*** 000 00:00:15.00
3) TestTransceiverIntegrity --------> **PD****I*** not configured
4) TestActiveToStandbyLoopback -----> M*PDS***I*** not configured
5) TestLoopback --------------------> M*PD****I*** not configured
6) TestNewIndexLearn ---------------> M**N****I*** not configured
7) TestDontConditionalLearn --------> M**N****I*** not configured
8) TestBadBpduTrap -----------------> M**D****I*** not configured
9) TestMatchCapture ----------------> M**D****I*** not configured
10) TestProtocolMatchChannel --------> M**D****I*** not configured
11) TestFibDevices ------------------> M**N****I*** not configured
12) TestIPv4FibShortcut -------------> M**N****I*** not configured
13) TestL3Capture2 ------------------> M**N****I*** not configured
14) TestIPv6FibShortcut -------------> M**N****I*** not configured
15) TestMPLSFibShortcut -------------> M**N****I*** not configured
16) TestNATFibShortcut --------------> M**N****I*** not configured
17) TestAclPermit -------------------> M**N****I*** not configured
18) TestAclDeny ---------------------> M**N****A*** 000 00:00:05.00
19) TestQoSTcam ---------------------> M**D****I*** not configured
<snip>
Diagnostics test suite attributes:
M/C/* - Minimal bootup level test / Complete bootup level test / NA
B/* - Basic ondemand test / NA
P/V/* - Per port test / Per device test / NA
D/N/* - Disruptive test / Non-disruptive test / NA
S/* - Only applicable to standby unit / NA
X/* - Not a health monitoring test / NA
F/* - Fixed monitoring interval test / NA
E/* - Always enabled monitoring test / NA
A/I - Monitoring is active / Monitoring is inactive
R/* - Power-down line cards and need reset supervisor / NA
K/* - Require resetting the line card after the test has completed / NA
T/* - Shut down all ports and need reset supervisor / NA
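The attribute strings in the listing (for example M*PDS***I***) appear to be positional: each of the twelve characters corresponds to one row of the legend, in order, and * means not applicable. The Python sketch below decodes a string under that assumption, which is consistent with the rows shown here but is not taken from Cisco documentation. The short descriptions and the decode function are illustrative.

```python
# One dict per character position, in the order of the legend above.
# Assumption: legend order equals column order in the attribute string.
LEGEND = [
    {"M": "minimal bootup level test", "C": "complete bootup level test"},
    {"B": "basic on-demand test"},
    {"P": "per-port test", "V": "per-device test"},
    {"D": "disruptive", "N": "non-disruptive"},
    {"S": "standby unit only"},
    {"X": "not a health-monitoring test"},
    {"F": "fixed monitoring interval"},
    {"E": "always-enabled monitoring test"},
    {"A": "monitoring active", "I": "monitoring inactive"},
    {"R": "powers down line cards, supervisor reset needed"},
    {"K": "line card reset required after the test"},
    {"T": "shuts down all ports, supervisor reset needed"},
]


def decode(attributes):
    """Translate a 12-character attribute string into a list of descriptions."""
    if len(attributes) != len(LEGEND):
        raise ValueError("expected %d characters" % len(LEGEND))
    decoded = []
    for char, column in zip(attributes, LEGEND):
        if char == "*":
            continue  # not applicable for this column
        if char not in column:
            raise ValueError("unexpected attribute %r" % char)
        decoded.append(column[char])
    return decoded
```

A quick way to use this before a maintenance window is to decode every test on a module and flag anything marked disruptive or requiring a reset.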
Generic Online Diagnostics: Catalyst GOLD Operation Example (Cont.)
20) TestL3VlanMet -------------------> M**N****I*** not configured n/a
21) TestIngressSpan -----------------> M**N****I*** not configured n/a
22) TestEgressSpan ------------------> M**D****I*** not configured n/a
23) TestNetflowInlineRewrite --------> C*PD****I*** not configured n/a
24) TestFabricSnakeForward ----------> M**N****I*** not configured n/a
25) TestFabricSnakeBackward ---------> M**N****I*** not configured n/a
26) TestTrafficStress ---------------> ***D****I**T not configured n/a
27) TestFibTcamSSRAM ----------------> ***D*X**IR** not configured n/a
28) TestAsicMemory ------------------> ***D*X**IR** not configured n/a
29) TestNetflowTcam -----------------> ***D*X**IR** not configured n/a
30) ScheduleSwitchover --------------> ***D****I*** not configured n/a
31) TestFirmwareDiagStatus ----------> M**N****I*** not configured n/a
32) TestAsicSync --------------------> ***N****A*** 000 00:00:15.00 10
Pay extra attention to memory tests: they can take hours to complete, and a reset is required after running them!
Generic Online Diagnostics: Catalyst GOLD Operation Example
Switch# show diagnostic result mod 7
Current bootup diagnostic level: complete
Module 7: CEF720 24 port 1000mb SFP
Overall Diagnostic Result for Module 7 : MINOR ERROR
Diagnostic level at card bootup: complete
Test results: (. = Pass, F = Fail, U = Untested)
1) TestTransceiverIntegrity:
Port 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
----------------------------------------------------------------------------
U U . U . . U U . . U U . . U U U U U U U U U U
2) TestLoopback:
Port 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
----------------------------------------------------------------------------
. . . . . . . . . . . . F . . . . . . . . . . .
3) TestScratchRegister -------------> .
4) TestSynchedFabChannel -----------> .
<snip>
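For per-port tests, the result row has to be lined up with the port numbers to see which port failed. The following Python sketch does this for the two rows shown above, assuming ports are numbered consecutively from 1 as in the header line; ports_with is a hypothetical helper name.

```python
# Result rows copied from the 'show diagnostic result mod 7' output above.
TEST_TRANSCEIVER_INTEGRITY = "U U . U . . U U . . U U . . U U U U U U U U U U"
TEST_LOOPBACK = ". . . . . . . . . . . . F . . . . . . . . . . ."


def ports_with(result_row, status, first_port=1):
    """Return the port numbers whose result equals status ('.', 'F' or 'U')."""
    return [first_port + index
            for index, value in enumerate(result_row.split())
            if value == status]
```

Here ports_with(TEST_LOOPBACK, "F") returns [13], which explains the MINOR ERROR in the overall result for module 7: port 13 failed TestLoopback.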
Generic Online Diagnostics: Catalyst GOLD Operation Example
r1# show diagnostic description module 5 test ?
<1-33> Test ID Number
ID  Test Name                    [On-Demand Test Attributes]
--- -------------------------------------------
1   TestScratchRegister          [***N****]
2   TestSPRPInbandPing           [***N****]
3   TestTransceiverIntegrity     [**PD****]
4   TestActiveToStandbyLoopback  [M*PDS***]
5   TestLoopback                 [M*PD****]
6   TestNewIndexLearn            [M**N****]
<snip>
r1# show diagnostic description module 5 test 2
TestSPRPInbandPing : By default, this test is enabled as health-monitoring test. The SP-RP Inband test catches most of the runtime software driver and hardware issues on supervisors. This is done by using diagnostic packet tests exercising the layer 2 forwarding engine, the L3-4 forwarding engine, and the replication engine along the path from the Switch Processor to the Route Processor. Packets are sent at an interval of 15 seconds and 10 consecutive failures of the SP-RP Inband test result in failover to the redundant supervisor (default).
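The failover condition described here is a consecutive-failure counter: a single passing ping resets it. The sketch below models that logic in Python to show what the default values imply. It is an illustration of the rule as stated in the test description, not of the actual GOLD implementation, and the function names are hypothetical.

```python
INTERVAL_SECONDS = 15    # default test interval from the description above
FAILURE_THRESHOLD = 10   # consecutive failures that trigger failover


def approx_detection_seconds(interval=INTERVAL_SECONDS,
                             threshold=FAILURE_THRESHOLD):
    """Rough upper bound on the time from fault to failover."""
    return interval * threshold


def failover_index(results, threshold=FAILURE_THRESHOLD):
    """Return the index of the test run that triggers failover, or None.

    results is an iterable of booleans, True meaning the ping passed.
    A pass resets the consecutive-failure counter.
    """
    consecutive = 0
    for index, passed in enumerate(results):
        consecutive = 0 if passed else consecutive + 1
        if consecutive >= threshold:
            return index
    return None
```

With the defaults, a persistent fault takes on the order of 150 seconds to trigger a supervisor failover, while an intermittent fault that lets one ping through every few runs never reaches the threshold.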
Generic Online Diagnostics: Recommendations
Bootup diagnostics: set level to complete
On-demand diagnostics:
Use as a pre-deployment tool: run complete diagnostics before putting hardware into the production environment
Use as a troubleshooting tool when suspecting hardware failure
Scheduled diagnostics:
Schedule key diagnostic tests periodically
Schedule all non-disruptive tests periodically
Health-monitoring diagnostics:
Key tests run by default
Enable additional non-disruptive tests for specific functionality enabled in your network: IPv6, MPLS, NAT