I recently had two HA pairs of Cisco ASA firewalls simply stop communicating. Rebooting both the primary and secondary firewall in each HA pair resolved the problem. I had never seen such odd behavior from two pairs of Cisco ASA firewalls, so I immediately suspected either a public exploit or a software bug, given that both HA pairs had been upgraded within the past 6-7 months.
Upon reviewing the 9.1.7 release notes from Cisco, I stumbled over the following entry:
CSCvd78303 – ARP functions fail after 213 days of uptime, drop with error ‘punt-rate-limit-exceeded’
We upgraded a large number of our Cisco ASAs to 9.1(7)12 about 213 days ago, and it now seems we’ll be upgrading again, this time to 9.1(7)19. The actual issue is resolved in 9.1(7)16.
Here’s the text of the bug report:
An ASA, after reaching an uptime of roughly 213 days will fail to process ARP packets leading to a condition where all traffic eventually stops passing through the affected device. Since not all existing ARP entries time out at the same time, not all connections may fail at the same time.
Additional symptoms include:
- ASA does not have ARP entries in its ARP table. show arp is empty
- The output of show asp drop and ASP drop captures indicate a rapidly increasing counter for punt-rate-limit exceeded and the dropped packets are predominantly ARP.
IMAGES WITH FIXES
Images with fixes for this defect will be published as soon as they are available, and posted to the Cisco Software Download center.
This is seen when the ASA’s uptime reaches 213 days.
This problem affects ASA and FTD versions:
ASA version 9.1 releases 9.1(7)8 and higher
ASA version 9.2 releases 9.2(4)15 and higher
ASA version 9.4 releases 9.4(3)5 and higher including 9.4(4)
ASA version 9.5 releases 9.5(3) and higher
ASA version 9.6 releases 9.6(2)1 and higher including 9.6(3)
ASA version 9.7 releases 9.7(1) and higher
FTD version 6.1 releases 126.96.36.199 and higher
FTD version 6.2 releases 6.2.0
Perform a pre-planned reboot of the device before approaching the 213 days (5124 hours) of up time. After the reboot, it will give you another 213 days of up time.
Further Problem Description:
Devices encountering this issue will not receive or respond to ARP packets. This affects not just transient traffic, but also access to the affected device – including Administration access such as SSH, HTTPS and Telnet. Console access is not affected.
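To make the reboot workaround above easier to plan, here is a minimal Python sketch that turns an uptime line into the time remaining before the 213-day mark. It assumes the uptime text looks like "fw01 up 200 days 3 hours", as printed by show version; that format, the hostnames, and the 14-day safety margin are my own assumptions, not something taken from the bug report.

```python
import re
from datetime import timedelta

# Threshold from the bug report. 213 days is 5112 hours, slightly less than
# the 5124 hours it also quotes, so using days is the conservative choice.
LIMIT = timedelta(days=213)

# Hypothetical safety margin: flag devices this close to the threshold.
MARGIN = timedelta(days=14)

UNIT_SECONDS = {
    "year": 365 * 86400,
    "day": 86400,
    "hour": 3600,
    "min": 60,
    "sec": 1,
}


def parse_uptime(line: str) -> timedelta:
    """Parse text such as 'fw01 up 200 days 3 hours' (assumed format)."""
    match = re.search(r"\bup\s+(.*)", line)
    if match is None:
        raise ValueError(f"no uptime found in: {line!r}")
    parts = re.findall(r"(\d+)\s+(year|day|hour|min|sec)", match.group(1))
    if not parts:
        raise ValueError(f"no uptime units found in: {line!r}")
    seconds = sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)
    return timedelta(seconds=seconds)


def time_left(uptime: timedelta) -> timedelta:
    """Time remaining before the device reaches 213 days of uptime."""
    return LIMIT - uptime


def reboot_due(uptime: timedelta) -> bool:
    """True when the device is within MARGIN of the threshold (or past it)."""
    return time_left(uptime) <= MARGIN
```

Feeding it one line per firewall and sorting by time_left gives a simple reboot schedule. A device already past the threshold will show a negative value.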
If you’re running a Cisco ASA firewall, I would recommend checking that you won’t be impacted by this bug.
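As a rough way to do that check across many devices, here is a Python sketch that compares an ASA version string against the affected releases listed in the bug report. The table is transcribed from the list above. The only fixed release it knows about is 9.1(7)16, because that is the only one mentioned in this post, so for every other train it can only tell you to check the release notes. The function names and result strings are my own, and FTD versions are not handled.

```python
import re

# First affected release per train, as (maintenance, interim),
# transcribed from the bug report quoted in this post.
FIRST_AFFECTED = {
    (9, 1): (7, 8),
    (9, 2): (4, 15),
    (9, 4): (3, 5),
    (9, 5): (3, 0),
    (9, 6): (2, 1),
    (9, 7): (1, 0),
}

# Only fix named in this post. Other trains are deliberately left out
# rather than guessed.
FIRST_FIXED = {(9, 1): (7, 16)}


def parse_version(version: str):
    """Split '9.1(7)12' into ((9, 1), (7, 12)); '9.4(4)' gives ((9, 4), (4, 0))."""
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)\)(\d*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised ASA version: {version!r}")
    major, minor, maint, interim = match.groups()
    return (int(major), int(minor)), (int(maint), int(interim or 0))


def bug_status(version: str) -> str:
    """Classify a version against the affected list above."""
    train, release = parse_version(version)
    if train not in FIRST_AFFECTED:
        return "not listed"
    if release < FIRST_AFFECTED[train]:
        return "not affected"
    fixed = FIRST_FIXED.get(train)
    if fixed is None:
        # At or above the first affected release, but no fix version is known here.
        return "check release notes"
    return "fixed" if release >= fixed else "affected"
```

For example, bug_status("9.1(7)12") returns "affected" and bug_status("9.1(7)19") returns "fixed", which matches what we saw on our own firewalls.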