I recently had two HA pairs of Cisco ASA firewalls simply stop communicating. Rebooting both the primary and secondary firewall in each HA pair resolved the problem. I had never seen such odd behavior from two pairs of Cisco ASA firewalls, so I immediately suspected either a public exploit or a software bug, given that both HA pairs had been upgraded within the past 6-7 months.
Upon reviewing the 9.1.7 release notes from Cisco I came across the following entry:
CSCvd78303 – ARP functions fail after 213 days of uptime, drop with error ‘punt-rate-limit-exceeded’
We upgraded a large number of our Cisco ASAs about 213 days ago to 9.1(7)12, and now it seems we’ll be upgrading again to 9.1(7)19. The actual issue is resolved in 9.1(7)16.
Here’s the text of the bug report:
Symptom:
An ASA, after reaching an uptime of roughly 213 days will fail to process ARP packets leading to a condition where all traffic eventually stops passing through the affected device. Since not all existing ARP entries time out at the same time, not all connections may fail at the same time.
Additional symptoms include:
- ASA does not have ARP entries in its ARP table. show arp is empty
- The output of show asp drop and ASP drop captures indicate a rapidly increasing counter for punt-rate-limit exceeded and the dropped packets are predominantly ARP.
IMAGES WITH FIXES
Images with fixes for this defect will be published as soon as they are available, and posted to the Cisco Software Download center.
Conditions:
This is seen when the ASA’s uptime reaches 213 days.
This problem affects ASA and FTD versions:
ASA version 9.1 releases 9.1(7)8 and higher
ASA version 9.2 releases 9.2(4)15 and higher
ASA version 9.4 releases 9.4(3)5 and higher including 9.4(4)
ASA version 9.5 releases 9.5(3) and higher
ASA version 9.6 releases 9.6(2)1 and higher including 9.6(3)
ASA version 9.7 releases 9.7(1) and higher
FTD version 6.1 releases 6.1.0.1 and higher
FTD version 6.2 releases 6.2.0
Workaround:
Perform a pre-planned reboot of the device before approaching the 213 days (5124 hours) of up time. After the reboot, it will give you another 213 days of up time.
Further Problem Description:
Devices encountering this issue will not receive or respond to ARP packets. This affects not just transient traffic, but also access to the affected device – including Administration access such as SSH, HTTPS and Telnet. Console access is not affected.
If you’re running a Cisco ASA firewall, I would recommend you check to make sure that you won’t be impacted by this bug.
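As a rough aid for that check, here is a minimal Python sketch that estimates how much time a device has left before it reaches the 5124-hour mark quoted in the bug report. It assumes you paste in the uptime line from "show version" and that the line looks something like "fw01 up 200 days 3 hours"; the hostname, the exact line format, and the function names are my own assumptions, so adjust the pattern if your output differs.

```python
import re
from datetime import timedelta

# Threshold taken from the bug report: 5124 hours (213 days 12 hours).
ARP_BUG_UPTIME = timedelta(hours=5124)

# Seconds per unit; a year is approximated as 365 days (assumption).
UNIT_SECONDS = {
    "year": 365 * 24 * 3600,
    "day": 24 * 3600,
    "hour": 3600,
    "min": 60,
    "sec": 1,
}


def parse_uptime(line):
    """Parse an uptime line such as 'fw01 up 200 days 3 hours'.

    The line format is an assumption about 'show version' output.
    """
    match = re.search(r"\bup\s+(.*)", line)
    if not match:
        raise ValueError("no uptime found in: %r" % line)
    parts = re.findall(r"(\d+)\s+(year|day|hour|min|sec)", match.group(1))
    if not parts:
        raise ValueError("no time units found in: %r" % line)
    seconds = sum(int(count) * UNIT_SECONDS[unit] for count, unit in parts)
    return timedelta(seconds=seconds)


def time_until_bug(line):
    """Return the time left before the 5124-hour mark.

    A negative result means the device is already past the threshold.
    """
    return ARP_BUG_UPTIME - parse_uptime(line)
```

Calling time_until_bug() on the uptime line of each firewall gives you a timedelta you can compare against your next maintenance window. It only does the arithmetic; it does not tell you whether the running version is one of the affected releases listed above.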
Cheers!
Alex says
There was a field notice sent by Cisco. We upgraded the devices as soon as we got that email.
Michael McNamara says
Hi Alex,
Why the hell didn’t you tell me?
;)
Cheers!
Emerson Albuquerque says
Hi,
My customer manually rebooted the ASA after the failover.
How can we identify whether the bug was the root cause of the failover?
Is there a way to verify the uptime history from before the manual reboot?
Regards,
Michael McNamara says
You can check the current uptime with the command “show version” from the CLI.
Cheers!
novice says
Early one morning I received a message that something had gone wrong in the network. I wasn’t able to VPN in or connect to one of the servers in the DMZ from home.
Worried about what had happened, I rushed to work. I tried to log in to the ASA but couldn’t. I used the console and saw that nothing was passing through the ASA. I issued show tech-support and it was slow as a turtle.
Eventually I rebooted the firewall and service was restored. The secondary ASA was not rebooted because we wanted a gap of a few days between the uptimes of the two ASAs. Hopefully the firmware will be updated in a few weeks’ time.