Things certainly went south for Facebook today in spectacular fashion, as Reddit and other forums lit up with posts about Facebook, Instagram and WhatsApp being down and unreachable. Someone asked me a simple question: how do you troubleshoot an outage like that? We’re obviously limited as “outsiders”, but even as regular netizens we can do a bit of investigative troubleshooting to get some idea of what’s going on at Facebook.
If you tried to visit Facebook earlier today you would have likely seen this message in your web browser.
This site can’t be reached
www.facebook.com’s server IP address could not be found.
Let’s start with the basics…. DNS resolution.
[root@woodstock ~]# dig facebook.com +short
[root@woodstock ~]#
That’s not good… we can’t get an IP address for facebook.com, let’s try www.facebook.com as well.
[root@woodstock ~]# dig www.facebook.com +short
[root@woodstock ~]#
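If you'd rather script that first check, here is a minimal Python sketch using only the standard library. The `resolve` helper and its `lookup` parameter are names I made up for illustration; it goes through the system's stub resolver (roughly what the browser does) rather than talking to a specific DNS server the way dig can.

```python
import socket

def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses for hostname, or [] on failure."""
    try:
        infos = lookup(hostname, None)
    except socket.gaierror:
        # Resolution failed: no such name, server failure, timeout, ...
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return sorted({info[4][0] for info in infos})

# Usage (needs network access):
#   for name in ("facebook.com", "www.facebook.com"):
#       print(name, resolve(name) or "no answer")
```

An empty list is the scripted equivalent of the blank dig output above.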
Ok, equally bad… let’s try to find the authoritative DNS servers for the domain facebook.com. We know from experience that a.gtld-servers.net. is one of the name servers for the .com TLD, but let’s confirm it’s still in the list of servers. (I’ll edit the output below to save space and focus our attention.)
[root@woodstock ~]# dig ns com
;; ANSWER SECTION:
com. 170780 IN NS b.gtld-servers.net.
com. 170780 IN NS i.gtld-servers.net.
com. 170780 IN NS m.gtld-servers.net.
com. 170780 IN NS j.gtld-servers.net.
com. 170780 IN NS l.gtld-servers.net.
com. 170780 IN NS e.gtld-servers.net.
com. 170780 IN NS k.gtld-servers.net.
com. 170780 IN NS h.gtld-servers.net.
com. 170780 IN NS g.gtld-servers.net.
com. 170780 IN NS d.gtld-servers.net.
com. 170780 IN NS c.gtld-servers.net.
com. 170780 IN NS a.gtld-servers.net.
com. 170780 IN NS f.gtld-servers.net.
;; ADDITIONAL SECTION:
a.gtld-servers.net. 69518 IN A 192.5.6.30
b.gtld-servers.net. 82780 IN A 192.33.14.30
c.gtld-servers.net. 84678 IN A 192.26.92.30
d.gtld-servers.net. 84679 IN A 192.31.80.30
e.gtld-servers.net. 84678 IN A 192.12.94.30
f.gtld-servers.net. 84138 IN A 192.35.51.30
g.gtld-servers.net. 84679 IN A 192.42.93.30
h.gtld-servers.net. 84678 IN A 192.54.112.30
i.gtld-servers.net. 84679 IN A 192.43.172.30
j.gtld-servers.net. 82780 IN A 192.48.79.30
k.gtld-servers.net. 84679 IN A 192.52.178.30
l.gtld-servers.net. 84138 IN A 192.41.162.30
m.gtld-servers.net. 84679 IN A 192.55.83.30
a.gtld-servers.net. 81113 IN AAAA 2001:503:a83e::2:30
Ok, so a.gtld-servers.net is still in there… so let’s ask that server which DNS servers are authoritative for the domain facebook.com.
[root@woodstock ~]# dig @a.gtld-servers.net. ns facebook.com
;; QUESTION SECTION:
;facebook.com. IN NS
;; AUTHORITY SECTION:
facebook.com. 172800 IN NS a.ns.facebook.com.
facebook.com. 172800 IN NS b.ns.facebook.com.
facebook.com. 172800 IN NS c.ns.facebook.com.
facebook.com. 172800 IN NS d.ns.facebook.com.
;; ADDITIONAL SECTION:
a.ns.facebook.com. 172800 IN A 129.134.30.12
a.ns.facebook.com. 172800 IN AAAA 2a03:2880:f0fc:c:face:b00c:0:35
b.ns.facebook.com. 172800 IN A 129.134.31.12
b.ns.facebook.com. 172800 IN AAAA 2a03:2880:f0fd:c:face:b00c:0:35
c.ns.facebook.com. 172800 IN A 185.89.218.12
c.ns.facebook.com. 172800 IN AAAA 2a03:2880:f1fc:c:face:b00c:0:35
d.ns.facebook.com. 172800 IN A 185.89.219.12
d.ns.facebook.com. 172800 IN AAAA 2a03:2880:f1fd:c:face:b00c:0:35
Those are the DNS servers for the domain facebook.com, so let’s see if we can communicate with any of them.
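If you want to work through all four servers rather than eyeballing the output, a small parser helps. This is a rough Python sketch that assumes the plain "name TTL IN type data" record lines dig prints; `parse_records` and `glue_addresses` are hypothetical helper names of my own, not part of any DNS library.

```python
from collections import defaultdict

def parse_records(dig_output):
    """Group resource records from dig output by record type."""
    records = defaultdict(list)
    for line in dig_output.splitlines():
        line = line.strip()
        # Skip blank lines and dig's comment/section lines, which start with ';'
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        if len(fields) < 5 or fields[2] != "IN" or not fields[1].isdigit():
            continue
        name, ttl, _cls, rtype = fields[:4]
        records[rtype].append((name, int(ttl), " ".join(fields[4:])))
    return records

def glue_addresses(records):
    """Map each name server listed in an NS record to its A/AAAA glue addresses."""
    ns_names = {data for _name, _ttl, data in records["NS"]}
    glue = defaultdict(list)
    for rtype in ("A", "AAAA"):
        for name, _ttl, data in records[rtype]:
            if name in ns_names:
                glue[name].append(data)
    return dict(glue)
```

Feeding it the output above gives one entry per name server, each with its IPv4 and IPv6 address, ready to loop over.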
Let’s start by pinging the servers (for brevity I’m only going to go through the first server above… but they were all having issues today).
[root@woodstock ~]# ping a.ns.facebook.com -c 5 -q
PING a.ns.facebook.com (129.134.30.12) 56(84) bytes of data.
--- a.ns.facebook.com ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 3999ms
That’s not completely unexpected, as many networks block or rate-limit ICMP traffic to guard against DoS attacks, so let’s try a simple DNS query to that server.
[root@woodstock ~]# dig @a.ns.facebook.com ns facebook.com
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.5 <<>> @a.ns.facebook.com ns facebook.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
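For the curious, this is roughly what that query looks like on the wire. The Python sketch below hand-builds a DNS question packet (the layout comes from RFC 1035) and waits for any UDP reply; `build_query` and `server_answers` are illustrative names of my own, and unlike dig this does no retries and no TCP fallback.

```python
import socket
import struct

def build_query(name, qtype=2, query_id=0x1234):
    """Build a minimal DNS query packet (qtype 2 = NS, 1 = A)."""
    # Header: ID, flags (0 = standard query, no recursion), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)

def server_answers(server_ip, name, timeout=3.0):
    """Return True if server_ip sends back a matching UDP reply within timeout."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _addr = sock.recvfrom(512)
        except OSError:
            # Covers timeouts as well as unreachable-network errors
            return False
    # A genuine reply echoes our 16-bit query ID
    return reply[:2] == query[:2]

# Usage (needs network access):
#   server_answers("129.134.30.12", "facebook.com")
```

A `False` here is the scripted version of dig's "no servers could be reached".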
That’s definitely not good, so we can assume at this point that we’re unable to communicate with the DNS servers for the facebook.com domain name, hence the error message we’re getting in the web browser. But let’s dig a little deeper to see if the IP networks associated with those DNS servers are “online” and reachable. We can do that by looking at a BGP looking glass or a full BGP routing table to see if the prefix is being advertised. We can also try a traceroute to the IP address in question and see if we can reach the Facebook network.
Let’s use WHOIS to see what network that IP address is a member of (again I’ve cut out some of the output below).
[root@woodstock ~]# whois 129.134.30.12
[Querying whois.arin.net]
[whois.arin.net]
NetRange: 129.134.0.0 - 129.134.255.255
CIDR: 129.134.0.0/16
NetName: THEFA-3
NetHandle: NET-129-134-0-0-1
Parent: NET129 (NET-129-0-0-0-0)
NetType: Direct Assignment
OriginAS:
Organization: Facebook, Inc. (THEFA-3)
RegDate: 2015-05-13
Updated: 2015-05-13
Ref: https://rdap.arin.net/registry/ip/129.134.0.0
Ok, so the original netblock assigned to Facebook by ARIN was 129.134.0.0/16, but Facebook could have subnetted that, so we need to be mindful that the advertised prefix could be smaller than the /16 we see assigned above.
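To make the subnetting point concrete, here is a small Python sketch with the standard `ipaddress` module. The `covering_subnet` helper is a name I made up, and the prefix lengths tried (/17 and /24) are just examples of how the /16 could be carved up, not something WHOIS tells us.

```python
import ipaddress

def covering_subnet(network, address, new_prefix):
    """Return the subnet of `network` with length `new_prefix` that contains `address`."""
    for subnet in network.subnets(new_prefix=new_prefix):
        if address in subnet:
            return subnet
    return None

allocation = ipaddress.ip_network("129.134.0.0/16")
name_server = ipaddress.ip_address("129.134.30.12")

# Which half of the /16, and which /24, would hold the name server?
half = covering_subnet(allocation, name_server, 17)
slash24 = covering_subnet(allocation, name_server, 24)
print(half, slash24)
```

Whichever size Facebook actually announces, the route we care about is the most specific one that still contains 129.134.30.12.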
There was a mention in some of the forums that all BGP peers to Facebook were down, so let’s check there. Let’s look at Hurricane Electric’s Network Looking Glass using the IP address 129.134.30.12. That shows us the following (as of 5:00PM EDT Monday October 4, 2021).
core1.mnz1.he.net> show ip bgp routes detail 129.134.30.12
Number of BGP Routes matching display condition : 2
S:SUPPRESSED F:FILTERED s:STALE x:BEST-EXTERNAL
1 Prefix: 129.134.0.0/17, Rx path-id:0x00000000, Tx path-id:0x00000001, rank:0x00000001, Status: BI, Age: 28d7h21m27s
NEXT_HOP: 65.49.109.182, Metric: 1486, Learned from Peer: 216.218.252.172 (6939)
LOCAL_PREF: 100, MED: 0, ORIGIN: igp, Weight: 0, GROUP_BEST: 1
AS_PATH: 3491 32934
COMMUNITIES: 6939:1111 6939:7039 6939:8392 6939:9003
2 Prefix: 129.134.0.0/17, Rx path-id:0x00000000, Tx path-id:0x00040001, rank:0x00000002, Status: Ex, Age: 86d22h8m40s
NEXT_HOP: 62.115.42.144, Metric: 0, Learned from Peer: 62.115.42.144 (1299)
LOCAL_PREF: 70, MED: 48, ORIGIN: igp, Weight: 0, GROUP_BEST: 1
AS_PATH: 1299 32934
COMMUNITIES: 6939:2000 6939:7297 6939:8840 6939:9001
Last update to IP routing table: 2d3h2m25s
Entry cached for another 60 seconds.
So it would appear that the routes are in the Internet BGP tables for that first server… I’m going to guess that Facebook is in recovery mode and slowly restoring their network – assuming it’s not a DoS attack or something similar.
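When reading output like this, the two things worth pulling out are the prefix and the last AS number in each AS_PATH, since the last one is the network originating the route (32934 is Facebook's AS number). A rough Python sketch, assuming the text layout shown above; `parse_routes` and `origin_as` are names of my own and other looking glasses format their output differently.

```python
import re

def parse_routes(looking_glass_output):
    """Extract the prefix and AS path of each route from looking glass text."""
    routes = []
    current = None
    for line in looking_glass_output.splitlines():
        prefix = re.search(r"Prefix:\s*([0-9a-fA-F.:]+/\d+)", line)
        if prefix:
            # A "Prefix:" line starts a new route entry
            current = {"prefix": prefix.group(1), "as_path": []}
            routes.append(current)
            continue
        as_path = re.search(r"AS_PATH:\s*([\d\s]+)", line)
        if as_path and current is not None:
            current["as_path"] = [int(asn) for asn in as_path.group(1).split()]
    return routes

def origin_as(route):
    """The origin AS is the last (rightmost) entry in the AS path."""
    return route["as_path"][-1] if route["as_path"] else None
```

Run against the output above, both routes come back as 129.134.0.0/17 with origin AS 32934, one learned via AS 3491 and one via AS 1299.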
Let’s try a traceroute using ICMP packets. Again, we need to be mindful that some organizations will block all ICMP traffic to protect themselves against miscreants and to better conceal their network topology.
[root@woodstock ~]# traceroute -I 129.134.30.12
traceroute to 129.134.30.12 (129.134.30.12), 30 hops max, 60 byte packets
 1  107.170.19.254 (107.170.19.254)  4.061 ms  4.040 ms  4.037 ms
 2  138.197.248.154 (138.197.248.154)  1.545 ms  1.558 ms  1.558 ms
 3  157.240.71.232 (157.240.71.232)  41.384 ms  41.345 ms  41.380 ms
 4  157.240.42.70 (157.240.42.70)  1.893 ms  1.911 ms  1.913 ms
 5  157.240.40.230 (157.240.40.230)  3.552 ms  3.529 ms  3.538 ms
 6  129.134.47.188 (129.134.47.188)  8.797 ms  7.276 ms  7.229 ms
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
Ok, so we’re definitely reaching parts of the Facebook network, as 129.134.47.188 is on the same advertised network as a.ns.facebook.com (129.134.30.12).
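That "same network" claim is easy to double-check. Here is a short Python sketch, assuming traceroute output with one hop per line; `responding_hops` is a hypothetical helper of mine, and the /17 is the prefix the looking glass reported earlier.

```python
import ipaddress
import re

def responding_hops(traceroute_output):
    """Return (hop_number, ip_address) for every hop that answered."""
    hops = []
    for line in traceroute_output.splitlines():
        # Matches lines like " 6  129.134.47.188 (129.134.47.188)  8.797 ms ..."
        match = re.match(r"\s*(\d+)\s+\S+\s+\((\d+\.\d+\.\d+\.\d+)\)", line)
        if match:
            hops.append((int(match.group(1)), ipaddress.ip_address(match.group(2))))
    return hops

advertised = ipaddress.ip_network("129.134.0.0/17")
name_server = ipaddress.ip_address("129.134.30.12")
last_hop = ipaddress.ip_address("129.134.47.188")

# The /17 spans 129.134.0.0 through 129.134.127.255, so both addresses fall inside it
print(name_server in advertised, last_hop in advertised)
```

Hops that only show `* * *` are skipped, so the last entry returned is the furthest point in the path that still answered us.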
Unfortunately that’s about as far as we can take it from here; we’ll need to wait for the news from Facebook itself.
Cheers!