Super-Networking Blog

Performance Issues on your 7600/6500 Series Cisco Devices

by admin on Oct.16, 2006, under Networking, Routers

We have been running into performance issues in our datacenter lately that don’t show many symptoms, except that things just don’t seem to be running fast enough. For a while I have had a feeling that something wasn’t right. We have also been having slowness issues with our NAS, and in packet captures we have seen a lot of dropped packets and retransmits.

Here is what I have seen from Cisco on the issue:

Interface/Module Connectivity Problems

Connectivity Problem or Packet Loss with WS-X6548-GE-TX and WS-X6148-GE-TX Modules used in a Server Farm

When you use either the WS-X6548-GE-TX or WS-X6148-GE-TX modules, there is a possibility that individual port utilization can lead to connectivity problems or packet loss on the surrounding interfaces. In particular, when you use EtherChannel and Remote Switched Port Analyzer (RSPAN) on these line cards, you can potentially see slow response due to packet loss. These line cards are oversubscription cards that are designed to extend gigabit to the desktop and might not be ideal for server farm connectivity. On these modules there is a single 1-Gigabit Ethernet uplink from the port ASIC that supports eight ports. These cards share a 1 Mb buffer between a group of ports (1-8, 9-16, 17-24, 25-32, 33-40, and 41-48) since each block of eight ports is 8:1 oversubscribed. The aggregate throughput of each block of eight ports cannot exceed 1 Gbps. Table 4 in the Cisco Catalyst 6500 Series 10/100- & 10/100/1000-Mbps Ethernet Interface Modules shows the different types of Ethernet interface modules and the supported buffer size per port.

Oversubscription happens because multiple ports are combined into a single Pinnacle ASIC. The Pinnacle ASIC is a direct memory access (DMA) engine that transfers packets between the backplane switching bus and the network ports. If any port in this range receives or transmits traffic at a rate that exceeds its bandwidth or utilizes a large amount of buffers to handle bursts of traffic, the other ports in the same range can potentially experience packet loss. The buffer assignment on these modules is documented in Buffers, Queues & Thresholds on Catalyst 6500 Ethernet Modules.
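To make the port grouping concrete, here is a minimal Python sketch. It assumes the six blocks of eight ports listed above and treats the 1 Gbps aggregate per block as a hard limit. The function names and the sample loads are hypothetical and only there for illustration; real numbers would have to come from your own interface statistics.

```python
PORTS_PER_BLOCK = 8        # eight ports share one port ASIC uplink
BLOCK_LIMIT_MBPS = 1000    # 1-Gigabit uplink per block of eight ports
TOTAL_PORTS = 48


def port_block(port: int) -> range:
    """Return the block of ports (1-8, 9-16, ...) that shares a buffer with `port`."""
    if not 1 <= port <= TOTAL_PORTS:
        raise ValueError("port must be between 1 and 48")
    first = ((port - 1) // PORTS_PER_BLOCK) * PORTS_PER_BLOCK + 1
    return range(first, first + PORTS_PER_BLOCK)


def oversubscribed_blocks(load_mbps: dict) -> dict:
    """Map the first port of each block to its aggregate load, for blocks over the limit."""
    totals = {}
    for port, mbps in load_mbps.items():
        first = port_block(port).start
        totals[first] = totals.get(first, 0.0) + mbps
    return {first: total for first, total in totals.items() if total > BLOCK_LIMIT_MBPS}


# Hypothetical offered load per port in Mbps (made-up numbers).
sample = {1: 700, 2: 250, 3: 200, 9: 400, 10: 300}
print(oversubscribed_blocks(sample))  # prints {1: 1150.0}
```

In this made-up sample, ports 1-3 together offer 1150 Mbps to the first block, so every port in 1-8 is at risk, while the block starting at port 9 stays under the limit.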

A SPAN destination is a very common cause since it is not uncommon to copy traffic from an entire VLAN or multiple ports to a single interface. On a card with individual interface buffers, the packets that exceed the bandwidth of the destination port are silently dropped and no other ports are affected. With a shared buffer, this causes connectivity problems for the other ports on this range. In most scenarios, shared buffers do not result in any problems. Even with eight gigabit attached workstations, it is rare that the provided bandwidth is exceeded.

The WS-X6548-GE-TX, WS-X6548V-GE-TX, WS-X6148-GE-TX, and WS-X6148V-GE-TX modules have a limitation with EtherChannel. For EtherChannel, the data from all links in a bundle goes to the port ASIC, even though the data is destined for another link. This data consumes bandwidth in the 1-Gigabit Ethernet link. For these modules, the sum total of all data on an EtherChannel cannot exceed 1 Gigabit.

Check this output in order to verify that the module experiences drops related to over utilized buffers:

  • CatOS

    Cat6500> (enable) show asicreg pinnacle err

    Look for these lines in the list of registers. If the values in this output are non-zero, it indicates that there were drops due to buffer overrun.

    015B: PI_PBT_S_QOS3_OUTLOST_REG = 0011

    015F: PI_PBT_S_HOLD_REG = D26C

  • Native IOS

    Cat6500# show counters interface gigabitEthernet | include qos3Outlost

    51. qos3Outlost = 768504851

Run the show commands several times to check if asicreg steadily increments. The asicreg outputs are cleared every time they are run. If the asicreg outputs remain non-zero then this indicates active drops. Based on the rate of traffic, this data might need to be collected over several minutes in order to get significant increments.
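If you collect the output by hand, a small script can do the comparing for you. The following is a rough sketch, not a Cisco tool: it only parses text you paste in, using the two line formats shown above. It assumes the CatOS register values are hexadecimal and that the Native IOS qos3Outlost counter accumulates between samples; neither is stated explicitly in the Cisco text. All function names are hypothetical.

```python
import re

# Matches Native IOS lines such as "51. qos3Outlost = 768504851".
IOS_RE = re.compile(r"qos3Outlost\s*=\s*(\d+)")
# Matches CatOS lines such as "015B: PI_PBT_S_QOS3_OUTLOST_REG = 0011".
CATOS_RE = re.compile(r"^\s*[0-9A-Fa-f]{4}:\s*(\w+)\s*=\s*([0-9A-Fa-f]+)\s*$")


def ios_outlost(output: str) -> int:
    """Return the qos3Outlost counter from pasted Native IOS output."""
    match = IOS_RE.search(output)
    if match is None:
        raise ValueError("no qos3Outlost line found")
    return int(match.group(1))


def catos_nonzero_registers(output: str) -> dict:
    """Return every register in pasted CatOS output whose value is non-zero."""
    registers = {}
    for line in output.splitlines():
        match = CATOS_RE.match(line)
        if match:
            value = int(match.group(2), 16)  # assumption: values are hex
            if value != 0:
                registers[match.group(1)] = value
    return registers


def drops_increasing(samples: list) -> bool:
    """True if each successive counter sample is larger than the one before."""
    return len(samples) > 1 and all(b > a for a, b in zip(samples, samples[1:]))


print(ios_outlost("51. qos3Outlost = 768504851"))  # prints 768504851
```

Feed `drops_increasing` a list of `ios_outlost` values taken a few minutes apart. For CatOS, the registers clear on every read, so any non-zero value returned by `catos_nonzero_registers` on a later run already points to active drops.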

Workaround

Complete these steps:

  1. Isolate any ports that might be consistently oversubscribed to their own range of ports in order to minimize the impact of drops to other interfaces. For example, if you have a server connected to port 1 which is oversubscribing the interface, this can lead to slow response if you have several other servers connected to the ports in the range 2-8. In this case, move the oversubscribing server to port 9 in order to free up the buffer in the first block of ports 1-8. On newer software versions, SPAN destinations have the buffering automatically moved to the interface so it does not impact the other ports in its range. Refer to Cisco bug IDs CSCed25278 (CatOS) and CSCin70308 (Native IOS) for more information.
  2. Disable head-of-line (HOL) blocking, so that the interface buffers are used instead of the shared buffers. This results in only the single overutilized port having drops. Since the interface buffers (32 k) are significantly smaller than the 1 Mb shared buffer, there can potentially be more packet loss on the individual ports. This is only recommended for extreme cases where slower clients or SPAN ports cannot be moved to the other line cards that offer dedicated interface buffers.
    • Native IOS

      Router(config)# interface gigabitethernet
      Router(config-if)# hol-blocking disable

      Once this is disabled, the drops move to the interface counters and can be seen with the show interface gigabit command. The other ports are no longer affected provided that they are also not individually bursting. Since it is recommended to keep HOL blocking enabled, this information can be used to find the device that overruns the buffers on the range of ports and move it to another card or an isolated range on the card so HOL blocking can be re-enabled.

    • CatOS

      Console> (enable) set port hol-blocking disable

      Once this is disabled, the drops move to the interface counters and can be seen with the show mac command. The other ports are no longer affected provided that they are not also individually bursting. Since it is recommended to keep HOL blocking enabled, this information can be used to find the device that overruns the buffers on the range of ports and move it to another card or an isolated range on the card so HOL blocking can be re-enabled.
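Step 1 of the workaround comes down to finding a block of eight ports with nothing else in it. Here is a small Python sketch of that search, assuming the same six blocks of eight ports on a 48-port card. The function name and the set of used ports are hypothetical; the script does not talk to the switch, so you would fill in the used ports from your own port inventory.

```python
from typing import Optional

PORTS_PER_BLOCK = 8
TOTAL_PORTS = 48


def suggest_isolated_port(hot_port: int, used_ports: set) -> Optional[int]:
    """Return the first port of a block with no ports in use, or None if there is none."""
    for first in range(1, TOTAL_PORTS + 1, PORTS_PER_BLOCK):
        block = set(range(first, first + PORTS_PER_BLOCK))
        if hot_port in block:
            continue  # skip the block the busy port already sits in
        if not block & used_ports:
            return first
    return None


# The scenario from step 1: a busy server on port 1, other servers on ports 2-8.
print(suggest_isolated_port(1, {1, 2, 3, 4, 5, 6, 7, 8}))  # prints 9
```

If the function returns None, every block already has something plugged in, and the remaining options are another line card or, for extreme cases, step 2.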

Also, here is a document on which switch modules to use for which purpose on your 6500 and 7600 series Cisco devices: http://super-networking.net/downloads/6500seriesmodules.pdf

