Who doesn’t love a good mystery, I’m no exception. A few weeks back we had an interesting issue pop-up. It was midnight on a Sunday night and PagerDuty started firing off an alert that a UPS in one of our distribution centers had just stopped responding via HTTPS. The UPS was still online responding to both ICMP and SNMP traffic, so the alert was acknowledged and the alarm was paused until it could be reviewed the next day.
The UPS itself was fairly new having been installed just under a year ago. It was an Schneider Electric/APC 3000RT Smart-UPS with an AP9631 network management card. Interestingly enough we just had an issue with a brand new APC 8000SRT Smart-UPS with an integrated AP9537SUM network management card that had essentially started doing the same exact thing a few days earlier. Only that installation was only a few days old when it stopped working. Again ICMP and SNMP worked fine… as did HTTP (if you enabled it).
What was that all about?
After a few hours of troubleshooting and digging I discovered that the self-signed SSL certificate installed on the NMC had expired. Any attempt to connect to the NMC via HTTPS after that point would result in the socket getting immediately closed upon connecting by the NMC. Removing the self-signed SSL certificate and rebooting the NMC caused the self-signed SSL certificate to be regenerated and the problem was resolved. You can remove the SSL certificate by enabling the HTTP server via either SSH or TELNET (will depend on the age of your card as to which one is enabled by default), login in via HTTP go to Configuration -> Network -> Web -> SSL Certificate and select Remove and Apply. You just need to reboot the NMC and you should be able to connect via HTTPS.
The self-signed SSL certificate is only good for one year, after which you’ll need to regenerate it again. The latest version of the firmware/software (NMC2 – v7.0.4) from APC sets the expiration date for all self-signed SSL certificates out to 2035 – not sure if the web browsers will start to complain about that.
If you haven’t already patched your APC network management cards it might be a good time to take care of that task as well. We had to patch all of our APC and Eaton network management cards that are used throughout our network.
Thanks a lot Michael, this was a real PITA for me. Your post saved me a ton of headaches!
John Walsh says
Thanks for this! I just had 6 devices with exactly this problem.
Vince Valenti says
You can also delete the certificate via SSH:
Reboot Management Interface
Enter ‘YES’ to continue or to cancel : YES