Here’s another short story detailing how a simple little change ended up being a bigger headache than I had planned.
It was a simple task: replace the SSL certificate for the discussion forums, since it was about to expire. Renewing the certificate from RapidSSL was relatively easy. I uploaded the new intermediate and certificate files to the server, bundled them together into a single file, modified the Nginx configuration and restarted the Nginx process. Oddly enough, after the restart I got an “ERR_CONNECTION_REFUSED” from Chrome. A quick test via cURL gave the same result: “connection refused”.
I backed out the configuration change and restarted Nginx, only to still have the same problem. Now that is very odd indeed, I thought: the change was rolled back, yet the issue remained. I quickly realized that the problem was affecting every website I managed on that server, and that all HTTP and HTTPS connections were being refused. A packet trace confirmed it: the server answered each SYN packet from the client with a TCP reset. I checked that Nginx was listening on TCP/80 and TCP/443 (using lsof -i and netstat -an), and it was listening on both ports.
The first hint came when I tried the IPv6 address with cURL and got a response. Nginx was answering IPv6 requests but refusing IPv4 requests. Something else must have changed besides the simple certificate configuration change that I had already rolled back.
A quick look at the yum history revealed that Nginx had been updated back on September 10th. That was a significant find.
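The listener check and the per-protocol test described above could be scripted roughly as follows. This is a minimal sketch, not necessarily the commands used during the incident: listen_families is a hypothetical helper that parses netstat -ltn output, and forum.example.com is a placeholder hostname. Keep in mind that a wildcard IPv6 listener (:::80) may or may not accept IPv4 connections depending on the socket's IPv6-only setting, so the listener table alone is not conclusive. The curl -4 and curl -6 probes are what actually separate the two cases.

```shell
# Report which address families have a TCP listener on a given port.
# Reads `netstat -ltn` output on stdin; the port is the first argument.
listen_families() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            addr = $4                      # local address column, e.g. 0.0.0.0:80 or :::80
            n = split(addr, parts, ":")
            if (parts[n] != port) next     # last field after ":" is the port
            if (n > 2) v6 = 1; else v4 = 1 # more than one colon means an IPv6 address
        }
        END {
            printf "IPv4 listener: %s\n", (v4 ? "yes" : "no")
            printf "IPv6 listener: %s\n", (v6 ? "yes" : "no")
        }'
}

# Usage on the server:
#   netstat -ltn | listen_families 80
# Then probe each protocol separately from a client (placeholder hostname):
#   curl -4 -sS -o /dev/null -w '%{http_code}\n' http://forum.example.com/
#   curl -6 -sS -o /dev/null -w '%{http_code}\n' http://forum.example.com/
```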
[root@centos ~]# yum history
Loaded plugins: fastestmirror
ID | Login user | Date and time | Action(s) | Altered
-------------------------------------------------------------------------------
42 | root | 2016-09-10 09:45 | I, U | 36 EE
41 | root | 2016-05-26 10:50 | I, U | 137 EE
40 | root | 2016-03-13 12:38 | Update | 19
39 | root | 2016-02-06 07:09 | Update | 10
38 | root | 2015-12-25 09:31 | Update | 39
[root@centos ~]# yum history info 42
Loaded plugins: fastestmirror
Transaction ID : 42
Begin time : Sat Sep 10 09:45:34 2016
Begin rpmdb : 378:4a758e818516d25c4ae06da426d3b43ef7f5624a
End time : 09:45:51 2016 (17 seconds)
End rpmdb : 385:489132724582a1c351cce4cf9ac0efa1a7fe4898
User : root
Return-Code : Success
Command Line : update
Transaction performed with:
Installed rpm-4.8.0-55.el6.i686 @base
Installed yum-3.2.29-73.el6.centos.noarch @base
Installed yum-plugin-fastestmirror-1.1.30-37.el6.noarch @base
Packages Altered:
Updated GeoIP-GeoLite-data-2015.12-1.el6.noarch @epel
Update 2016.07-1.el6.noarch @epel
Updated GeoIP-GeoLite-data-extra-2015.12-1.el6.noarch @epel
Update 2016.07-1.el6.noarch @epel
Updated avahi-libs-0.6.25-15.el6.i686 @base
Update 0.6.25-15.el6_8.1.i686 @updates
Updated cronie-1.4.4-15.el6_7.1.i686 @base
Update 1.4.4-16.el6_8.2.i686 @updates
Updated cronie-anacron-1.4.4-15.el6_7.1.i686 @base
Update 1.4.4-16.el6_8.2.i686 @updates
Updated httpd-2.2.15-53.el6.centos.i686 @base
Update 2.2.15-54.el6.centos.i686 @updates
Updated httpd-tools-2.2.15-53.el6.centos.i686 @base
Update 2.2.15-54.el6.centos.i686 @updates
Updated initscripts-9.03.53-1.el6.centos.i686 @base
Update 9.03.53-1.el6.centos.1.i686 @updates
Updated innotop-1.10.0-0.3.81da83f.el6.noarch @epel
Update 1.11.1-1.el6.noarch @epel
Updated libtiff-3.9.4-10.el6_5.i686 @base
Update 3.9.4-18.el6_8.i686 @updates
Updated libxml2-2.7.6-21.el6.i686 @base
Update 2.7.6-21.el6_8.1.i686 @updates
Updated libxml2-python-2.7.6-21.el6.i686 @base
Update 2.7.6-21.el6_8.1.i686 @updates
Updated nginx-1.0.15-12.el6.i686 @epel
Update 1.10.1-1.el6.i686 @epel
Dep-Install nginx-all-modules-1.10.1-1.el6.noarch @epel
Updated nginx-filesystem-1.0.15-12.el6.noarch @epel
Update 1.10.1-1.el6.noarch @epel
Dep-Install nginx-mod-http-geoip-1.10.1-1.el6.i686 @epel
Dep-Install nginx-mod-http-image-filter-1.10.1-1.el6.i686 @epel
Dep-Install nginx-mod-http-perl-1.10.1-1.el6.i686 @epel
Dep-Install nginx-mod-http-xslt-filter-1.10.1-1.el6.i686 @epel
Dep-Install nginx-mod-mail-1.10.1-1.el6.i686 @epel
Dep-Install nginx-mod-stream-1.10.1-1.el6.i686 @epel
Updated nss-softokn-3.14.3-23.el6_7.i686 @base
Update 3.14.3-23.3.el6_8.i686 @updates
Updated nss-softokn-freebl-3.14.3-23.el6_7.i686 @base
Update 3.14.3-23.3.el6_8.i686 @updates
Updated php-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated php-cli-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated php-common-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated php-fpm-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated php-gd-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated php-mysql-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated php-pdo-5.3.3-47.el6.i686 @base
Update 5.3.3-48.el6_8.i686 @updates
Updated python-2.6.6-64.el6.i686 @base
Update 2.6.6-66.el6_8.i686 @updates
Updated python-libs-2.6.6-64.el6.i686 @base
Update 2.6.6-66.el6_8.i686 @updates
Updated tar-2:1.23-14.el6.i686 @base
Update 2:1.23-15.el6_8.i686 @updates
Updated tzdata-2016d-1.el6.noarch @updates
Update 2016f-1.el6.noarch @updates
Updated udev-147-2.73.el6.i686 @base
Update 147-2.73.el6_8.2.i686 @updates
Updated yum-3.2.29-73.el6.centos.noarch @base
Update 3.2.29-75.el6.centos.noarch @updates
Scriptlet output:
1 warning: /etc/nginx/conf.d/default.conf created as /etc/nginx/conf.d/default.conf.rpmnew
2 warning: /etc/nginx/nginx.conf created as /etc/nginx/nginx.conf.rpmnew
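The scriptlet warnings at the end of that transaction show that the package shipped new default configuration files alongside the ones in use. A small helper along these lines can list every .rpmnew file and show how it differs from the active file. It is only a sketch: review_rpmnew is a hypothetical name, and /etc/nginx is assumed as the default directory because that is where the warnings above point.

```shell
# List *.rpmnew files under a config directory and show how each one
# differs from the file that is actually in use.
review_rpmnew() {
    dir="${1:-/etc/nginx}"
    find "$dir" -type f -name '*.rpmnew' | sort | while IFS= read -r new; do
        current="${new%.rpmnew}"           # strip the suffix to get the active file
        echo "== $current vs $new"
        diff -u "$current" "$new" || true  # diff exits 1 when the files differ
    done
}

# Usage:
#   review_rpmnew /etc/nginx
```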
This was the first time I had restarted Nginx since the update back in September, and that was the key to unlocking the mystery. I tried backing out the update
yum history undo 42
but that left me without Nginx installed at all. I suspected something had changed in Nginx with the update. I knew that the server was responding to IPv6 requests but not IPv4 requests, so I started looking at the configuration files for the virtual hosts and quickly focused on my use of a single listen directive for both IPv4 and IPv6.
listen [::]:80;
I looked back at the server logs and determined that Nginx was upgraded from 1.0.15 to 1.10.1 back in September. It turns out that as of 1.3.4, the ipv6only parameter of the listen directive is enabled by default, so a socket declared with listen [::]:80; accepts only IPv6 connections and no longer answers IPv4 requests as well. While doing some research I also stumbled across an article from Michael Hughes titled ‘Nginx ipv6only setting gotcha’ which described the same issue I was experiencing.
I adjusted the configuration of my virtual hosts to use separate listen directives for IPv4 and IPv6:
listen 80;
listen [::]:80;
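On a server with many virtual hosts, a quick scan can point out which configuration files still carry only the IPv6 form. The function below is a rough sketch under several assumptions: find_v6_only_listens is a hypothetical name, it works per file rather than per server block, it only recognizes the wildcard IPv4 forms (listen 80;, listen *:80; and listen 0.0.0.0:80;), and it does not account for a listen line that sets ipv6only=off explicitly.

```shell
# Print config files that have an IPv6 wildcard listen on a port
# but no wildcard IPv4 listen for that same port.
find_v6_only_listens() {
    dir="${1:-/etc/nginx}"
    port="${2:-80}"
    v6="^[[:space:]]*listen[[:space:]]+\[::\]:${port}([[:space:]][^;]*)?;"
    v4="^[[:space:]]*listen[[:space:]]+((0\.0\.0\.0|\*):)?${port}([[:space:]][^;]*)?;"
    grep -rlE "$v6" "$dir" 2>/dev/null | sort | while IFS= read -r file; do
        # Report the file only if no IPv4 listen for this port is present.
        if ! grep -qE "$v4" "$file"; then
            echo "$file"
        fi
    done
}

# Usage:
#   find_v6_only_listens /etc/nginx 80
#   find_v6_only_listens /etc/nginx 443
```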
I had planned to spend about 30 minutes replacing the SSL certificate. After almost 2 hours of downtime, I finally managed to get the websites up and running again. This is par for the course when working in Information Technology: you usually need to be a part-time detective to figure out what broke before you can fix anything. I eventually got back around to replacing the SSL certificate, and that worked without issue.
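For completeness, the bundling step from the original task can be sketched like this. Nginx expects the server certificate first, followed by the intermediate, in the file given to ssl_certificate. The function name make_bundle and the file names in the usage comment are placeholders, not the actual files used here.

```shell
# Build an Nginx certificate bundle: server certificate first, then the
# intermediate. awk 1 re-emits every line with a trailing newline, so the
# two PEM blocks cannot run together if the first file lacks a final newline.
make_bundle() {
    cert="$1"; chain="$2"; out="$3"
    awk 1 "$cert" "$chain" > "$out"
}

# Usage (placeholder file names); nginx -t only checks the configuration
# syntax and that the referenced files load, so still test both IPv4 and
# IPv6 after the restart:
#   make_bundle forum.crt intermediate.crt forum.bundle.crt
#   nginx -t && service nginx restart
```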
Cheers!
Sounds very complicated! Hope it doesn’t happen again.
Very good example of analytical skills being just as important as technical ones.
I often find myself in situations where a small change brings up lots of problems, defying any logic (at first) until you figure it out!
Thanks for the comment Marcel!