Here’s another short story detailing how a simple little change ended up being a bigger headache than I had planned.
It was a simple task: replace the SSL certificate for the discussion forums, since it was about to expire. Renewing the certificate from RapidSSL was relatively easy. I uploaded the new intermediate and certificate files to the server, bundled them together into a single file, modified the Nginx configuration, and restarted the Nginx process. Oddly enough, after the restart I got an “ERR_CONNECTION_REFUSED” from Chrome. A quick test via cURL gave the same result, “connection refused”. I backed out the configuration change and restarted Nginx, only to still have the same problem. I thought, “now that is very odd indeed”.
I quickly realized that the problem was impacting all the websites I managed on that server, and it appeared that every HTTP and HTTPS connection was being refused. A packet trace confirmed this: the server was sending a TCP reset in response to each SYN packet from the client. I checked whether Nginx was listening on TCP/80 and TCP/443 (for example, with lsof -i or netstat -an), and it was listening on both ports.
I got a hint when I tried the IPv6 address using cURL and got a response. Nginx was answering IPv6 requests but not IPv4 requests. Something else must have changed outside of the simple certificate configuration change that I had already rolled back.
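A rough sketch of how the two address families could be checked separately is shown here. The helper name listener_families is hypothetical, and the host names in the comments are placeholders. It only reports which kind of socket is bound to a port; a port that shows up only as an IPv6 wildcard socket (:::80) may or may not also accept IPv4 clients, depending on the socket's IPV6_V6ONLY setting, so the curl tests in the comments are still the real proof.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `netstat -lnt` style output on stdin and report
# which address families have a listening socket on the given TCP port.
# Prints one of: v4+v6, v4-socket-only, v6-socket-only, none.
listener_families() {
    local port="$1"
    awk -v port="$port" '
        $6 == "LISTEN" {
            addr = $4
            n = match(addr, /:[0-9]+$/)          # position of the final ":port"
            if (n == 0) next
            if (substr(addr, n + 1) != port) next
            host = substr(addr, 1, n - 1)        # e.g. "0.0.0.0" or "::"
            if (index(host, ":") > 0) v6 = 1; else v4 = 1
        }
        END {
            if (v4 && v6)  print "v4+v6"
            else if (v4)   print "v4-socket-only"
            else if (v6)   print "v6-socket-only"
            else           print "none"
        }'
}

# Possible usage on the server (not run here):
#   netstat -lnt | listener_families 80
#   netstat -lnt | listener_families 443
#
# From a client, force each address family in turn (placeholder host name):
#   curl -4 -sS -o /dev/null -w '%{http_code}\n' http://www.example.com/
#   curl -6 -sS -o /dev/null -w '%{http_code}\n' http://www.example.com/
```

If the -6 request succeeds while the -4 request is refused, the problem is in how the listening sockets are bound rather than in the site configuration itself.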
A quick look at the yum history revealed that Nginx had been updated back on September 10th, which was a significant find.
[root@centos ~]# yum history
Loaded plugins: fastestmirror
ID     | Login user               | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    42 | root                     | 2016-09-10 09:45 | I, U           |   36 EE
    41 | root                     | 2016-05-26 10:50 | I, U           |  137 EE
    40 | root                     | 2016-03-13 12:38 | Update         |   19
    39 | root                     | 2016-02-06 07:09 | Update         |   10
    38 | root                     | 2015-12-25 09:31 | Update         |   39
[root@centos ~]# yum history info 42
Loaded plugins: fastestmirror
Transaction ID : 42
Begin time     : Sat Sep 10 09:45:34 2016
Begin rpmdb    : 378:4a758e818516d25c4ae06da426d3b43ef7f5624a
End time       : 09:45:51 2016 (17 seconds)
End rpmdb      : 385:489132724582a1c351cce4cf9ac0efa1a7fe4898
User           : root
Return-Code    : Success
Command Line   : update
Transaction performed with:
    Installed   rpm-4.8.0-55.el6.i686 @base
    Installed   yum-3.2.29-73.el6.centos.noarch @base
    Installed   yum-plugin-fastestmirror-1.1.30-37.el6.noarch @base
Packages Altered:
    Updated     GeoIP-GeoLite-data-2015.12-1.el6.noarch @epel
    Update      2016.07-1.el6.noarch @epel
    Updated     GeoIP-GeoLite-data-extra-2015.12-1.el6.noarch @epel
    Update      2016.07-1.el6.noarch @epel
    Updated     avahi-libs-0.6.25-15.el6.i686 @base
    Update      0.6.25-15.el6_8.1.i686 @updates
    Updated     cronie-1.4.4-15.el6_7.1.i686 @base
    Update      1.4.4-16.el6_8.2.i686 @updates
    Updated     cronie-anacron-1.4.4-15.el6_7.1.i686 @base
    Update      1.4.4-16.el6_8.2.i686 @updates
    Updated     httpd-2.2.15-53.el6.centos.i686 @base
    Update      2.2.15-54.el6.centos.i686 @updates
    Updated     httpd-tools-2.2.15-53.el6.centos.i686 @base
    Update      2.2.15-54.el6.centos.i686 @updates
    Updated     initscripts-9.03.53-1.el6.centos.i686 @base
    Update      9.03.53-1.el6.centos.1.i686 @updates
    Updated     innotop-1.10.0-0.3.81da83f.el6.noarch @epel
    Update      1.11.1-1.el6.noarch @epel
    Updated     libtiff-3.9.4-10.el6_5.i686 @base
    Update      3.9.4-18.el6_8.i686 @updates
    Updated     libxml2-2.7.6-21.el6.i686 @base
    Update      2.7.6-21.el6_8.1.i686 @updates
    Updated     libxml2-python-2.7.6-21.el6.i686 @base
    Update      2.7.6-21.el6_8.1.i686 @updates
    Updated     nginx-1.0.15-12.el6.i686 @epel
    Update      1.10.1-1.el6.i686 @epel
    Dep-Install nginx-all-modules-1.10.1-1.el6.noarch @epel
    Updated     nginx-filesystem-1.0.15-12.el6.noarch @epel
    Update      1.10.1-1.el6.noarch @epel
    Dep-Install nginx-mod-http-geoip-1.10.1-1.el6.i686 @epel
    Dep-Install nginx-mod-http-image-filter-1.10.1-1.el6.i686 @epel
    Dep-Install nginx-mod-http-perl-1.10.1-1.el6.i686 @epel
    Dep-Install nginx-mod-http-xslt-filter-1.10.1-1.el6.i686 @epel
    Dep-Install nginx-mod-mail-1.10.1-1.el6.i686 @epel
    Dep-Install nginx-mod-stream-1.10.1-1.el6.i686 @epel
    Updated     nss-softokn-3.14.3-23.el6_7.i686 @base
    Update      3.14.3-23.3.el6_8.i686 @updates
    Updated     nss-softokn-freebl-3.14.3-23.el6_7.i686 @base
    Update      3.14.3-23.3.el6_8.i686 @updates
    Updated     php-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     php-cli-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     php-common-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     php-fpm-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     php-gd-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     php-mysql-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     php-pdo-5.3.3-47.el6.i686 @base
    Update      5.3.3-48.el6_8.i686 @updates
    Updated     python-2.6.6-64.el6.i686 @base
    Update      2.6.6-66.el6_8.i686 @updates
    Updated     python-libs-2.6.6-64.el6.i686 @base
    Update      2.6.6-66.el6_8.i686 @updates
    Updated     tar-2:1.23-14.el6.i686 @base
    Update      2:1.23-15.el6_8.i686 @updates
    Updated     tzdata-2016d-1.el6.noarch @updates
    Update      2016f-1.el6.noarch @updates
    Updated     udev-147-2.73.el6.i686 @base
    Update      147-2.73.el6_8.2.i686 @updates
    Updated     yum-3.2.29-73.el6.centos.noarch @base
    Update      3.2.29-75.el6.centos.noarch @updates
Scriptlet output:
   1 warning: /etc/nginx/conf.d/default.conf created as /etc/nginx/conf.d/default.conf.rpmnew
   2 warning: /etc/nginx/nginx.conf created as /etc/nginx/nginx.conf.rpmnew
history info
This was the first time I had restarted Nginx since the update back in September, and that was the key to unlocking the mystery. I tried backing out the update
yum history undo 42
but that left me without Nginx installed at all. I suspected that something had changed in Nginx with the update. I knew that the server was responding to IPv6 requests but not IPv4 requests, so I started looking at the configuration files for the virtual hosts and quickly focused on my use of a single listen directive for both IPv4 and IPv6.
I looked back at the server logs and determined that Nginx had been upgraded from 1.0.15-12 to 1.10.1 back in September. It turns out that as of 1.3.4, the ipv6only parameter of the listen directive is turned on by default, so an IPv6 wildcard listen socket (such as [::]:80) accepts only IPv6 connections and no longer picks up IPv4 connections as well. While doing some research I also stumbled across an article from Michael Hughes titled ‘Nginx ipv6only setting gotcha’ which described the same issue I was experiencing.
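To check which side of that 1.3.4 boundary a given server is on, something like the following could be used. This is only a sketch: version_ge, ipv6only_default_on and installed_nginx_version are hypothetical helper names, and it assumes GNU sort (for the -V version sort) is available. A plain string comparison would get this wrong, since "1.10.1" sorts before "1.3.4" alphabetically.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when dotted version $1 is >= $2.
# Relies on GNU sort -V to order version strings numerically per component.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed when the given Nginx version has ipv6only on by default,
# that is, when it is 1.3.4 or later.
ipv6only_default_on() {
    version_ge "$1" 1.3.4
}

# Extract the version number from `nginx -v`, which prints a line such as
# "nginx version: nginx/1.10.1" to stderr.
installed_nginx_version() {
    nginx -v 2>&1 | sed -n 's#^nginx version: nginx/\([0-9.]*\).*#\1#p'
}

# Possible usage (not run here):
#   if ipv6only_default_on "$(installed_nginx_version)"; then
#       echo "listen [::]:80 will be IPv6 only unless ipv6only=off is set"
#   fi
```

With the versions from this story, 1.0.15 falls before the change and 1.10.1 falls after it, which is why the behaviour only showed up on the first restart following the update.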
I adjusted the configuration of my virtual hosts by using the following:
listen 80;
listen [::]:80;
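On a server with many virtual hosts, a quick scan can point out which files still rely on a lone IPv6 listen directive. The sketch below is a rough heuristic rather than a real Nginx configuration parser: find_v6_only_vhosts is a hypothetical name, it works per file rather than per server block, and the /etc/nginx/conf.d path in the usage comment is an assumption about where the virtual hosts live.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list *.conf files in a directory that contain an
# IPv6 wildcard listen directive (listen [::]:PORT) but no IPv4 one
# (listen PORT, listen *:PORT or listen A.B.C.D:PORT).
# Commented-out directives are ignored because the pattern must start the line.
find_v6_only_vhosts() {
    local dir="$1" f
    for f in "$dir"/*.conf; do
        [ -e "$f" ] || continue
        if grep -Eq '^[[:space:]]*listen[[:space:]]+\[::\]:' "$f" &&
           ! grep -Eq '^[[:space:]]*listen[[:space:]]+([0-9.*]+:)?[0-9]+' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Possible usage (path is an assumption, not run here):
#   find_v6_only_vhosts /etc/nginx/conf.d
# After editing the flagged files, validate before restarting:
#   nginx -t
```

Running nginx -t before the restart is worthwhile here, since a restart with a broken configuration would take every site on the server down again.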
I had planned to spend about 30 minutes replacing the SSL certificate. After almost 2 hours of downtime, I finally managed to get the websites up and running again. This is par for the course when working in Information Technology: you usually need to be a part-time detective to figure out what broke before you can fix anything. I eventually got back around to replacing the SSL certificate, and that worked without issue.
Jessica McNamara says
Sounds very complicated! Hope it doesn’t happen again.
Marcel Stürtz says
Very good example of analytical skills being just as important as technical ones.
I often find myself in situations where a small change brings up lots of problems, defying any logic (at first) until you figure it out!
Michael McNamara says
Thanks for the comment Marcel!