Here’s a short story, still partially in progress. With all the security breaches going on, I thought for a few moments that I might have been caught up in one of them. This past Thursday night I noticed that both my virtual servers, which are hosted by Digital Ocean, were very slow to log in to via SSH. I mean extremely slow, on the order of 8 to 10 seconds. Accessing the WordPress installation on one of those virtual servers was also extremely slow. I began to fear that one of my servers had fallen victim to some vulnerability or exploit, and because the servers share SSH keys, that could mean both had been accessed without my knowledge or permission.
I checked that both DNS servers were responding and resolving queries:
/etc/resolv.conf:
nameserver 22.214.171.124
nameserver 126.96.36.199
I was able to resolve www.google.com from both 188.8.131.52 and 184.108.40.206, so I initially thought the problem wasn’t with DNS. However, when I was unable to figure out where the problem was, I decided to disable DNS resolution within SSH by placing the following statement in /etc/ssh/sshd_config:
UseDNS no
A quick restart of sshd (service sshd restart) and the problem seemed to be resolved... but what had fixed it?
I went back to the /etc/resolv.conf file and decided to change the order of the DNS servers, placing Google’s DNS server ahead of Level3’s and adding a second Google DNS server into the mix. Digital Ocean doesn’t maintain its own DNS infrastructure, which is somewhat surprising for such a large provider.
/etc/resolv.conf:
nameserver 220.127.116.11
nameserver 18.104.22.168
nameserver 22.214.171.124
And to my surprise, when I went back to test, the problem was gone. So was there some problem with 126.96.36.199?
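One way to probe that question is to time a lookup against each nameserver on its own, rather than going through the system resolver, which (as I understand the resolv.conf defaults) tries the servers in listed order and waits several seconds on an unresponsive one before moving on. That behavior would be consistent with the 8 to 10 second SSH delay when the first server is flaky. The following is a minimal sketch, not a definitive diagnostic: the function names are made up for illustration, it sends a single hand-built A-record query over UDP, and it only measures how long any reply takes without validating the answer.

```python
import socket
import struct
import time


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text, in order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only lines of the form "nameserver <address>" are of interest here.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = b""
    for label in hostname.rstrip(".").split("."):
        encoded = label.encode("ascii")
        question += bytes([len(encoded)]) + encoded
    # Zero-length root label, then QTYPE=A (1) and QCLASS=IN (1).
    question += b"\x00" + struct.pack(">HH", 1, 1)
    return header + question


def time_lookup(server, hostname, timeout=5.0):
    """Return seconds until a UDP reply arrives from server, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(hostname), (server, 53))
            sock.recvfrom(512)  # the reply is not parsed or validated
        except socket.timeout:
            return None
        return time.monotonic() - start


def report(path="/etc/resolv.conf", hostname="www.google.com"):
    """Print one timing line per nameserver found in the given resolv.conf."""
    with open(path) as handle:
        servers = parse_nameservers(handle.read())
    for server in servers:
        elapsed = time_lookup(server, hostname)
        result = "timed out" if elapsed is None else f"{elapsed * 1000:.0f} ms"
        print(f"{server}: {result}")
```

Calling report() a few times over several minutes should show whether one server times out intermittently. A single fast answer, like my www.google.com test, doesn't rule that out.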
With that problem partially explained, I decided to apply the latest and greatest patches and security updates for CentOS 6.5. And there I ran into another problem… upon rebooting I found that the OS was unable to load the driver for the network interface eth0, returning the error:
FATAL: Could not load /lib/modules/2.6.32-358.6.2.el6.x86_64/modules.dep
I was able to quickly change my kernel version via the Digital Ocean control panel to 2.6.32-431.20.3.el6.x86_64 and reboot.
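The error reads as though the booted kernel had no matching module tree on disk, which can happen when the kernel is chosen outside the guest (here, in the control panel) and an update changes what is installed under /lib/modules. That is an interpretation of the message, not something confirmed here. A minimal sketch of a check that could be run before rebooting, with helper names invented for illustration:

```python
import os


def modules_dep_path(kernel_release, modules_root="/lib/modules"):
    """Return the modules.dep path the module loader expects for a kernel release."""
    return os.path.join(modules_root, kernel_release, "modules.dep")


def installed_kernel_releases(modules_root="/lib/modules"):
    """List kernel releases that have a modules.dep file under modules_root."""
    if not os.path.isdir(modules_root):
        return []
    return sorted(
        name
        for name in os.listdir(modules_root)
        if os.path.isfile(modules_dep_path(name, modules_root))
    )


def running_kernel_has_modules(modules_root="/lib/modules"):
    """Return True if the currently running kernel has a module tree on disk."""
    return os.uname().release in installed_kernel_releases(modules_root)
```

If running_kernel_has_modules() returns False, or the kernel version selected in the control panel is not in installed_kernel_releases(), the two are out of step and network drivers built as modules will probably fail to load.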
With all that done, I turned my attention toward my WordPress administration, which was still extremely slow. I had been guessing that the problem was probably related to some plugin that I had recently updated, so I went back and disabled all the plugins to see if I could find the one that was causing the slowdown. That search proved fruitless, so I turned my attention to the performance of Nginx, PHP-FPM and MySQL. After optimizing the tables and some of the configurations within MySQL, I found the response of the WordPress Admin portal better (3-5 seconds), but there’s still a problem somewhere there that I need to track down.
Here are a few resources if you are struggling with tuning your MySQL instance:
- MySQL Tuning Primer Script by Matthew Montgomery
- MySQLTuner-perl by Major Hayden
- mysqlfragfinder by Phil Dufault
It’s never a boring day blogging on the Internet.
Adam Mini says
I have always had intermittent issues with 188.8.131.52 (not sure why). Because of that, I usually test with it because it’s easy to remember, but I never count on the results to be flawless. I use that server as a tertiary.
I had the same issue with 184.108.40.206, which appeared on August 20th. Here’s another recent report http://codepie.org/tag/dns. There’s a maintenance entry in http://status.digitalocean.com/ and it seems they have broken something.