Akamai CDN and TCP Connections

In my latest adventure I had to untangle the interaction between a pair of Cisco ACE 4710s and Akamai’s Content Distribution Network (CDN), including SiteShield, Mointpoint, and SiteSpect. It’s truly amazing how complex, almost convoluted, a CDN can make any website. And when it fails, you can guess who’s going to get the blame. Over the past few weeks I’ve been looking at a very interesting problem where an Internet-facing VIP was experiencing a very unbalanced distribution across the real servers in the serverfarm. I wrote a few quick and dirty Bash shell scripts to do some repeated load tests with curl, and sure enough I was able to confirm that something was amiss between the CDN and the LB. If I tested against the origin VIP I had near-perfect round-robin load-balancing across the real servers in the VIP; if I tested against the CDN I got very uneven load-balancing results.
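The original tests were Bash scripts built on curl. The block below is only a Python stand-in for that idea, not those scripts. It assumes each real server identifies itself in a response header; the header name `X-Served-By` is hypothetical, so substitute whatever marker your servers actually return.

```python
from collections import Counter
from typing import Callable
import urllib.request


def fetch_backend_id(url: str, header: str = "X-Served-By") -> str:
    """Request url once and return the header value that names the real server.

    The header name is an assumption; the origin has to be set up to send it.
    urllib opens a new TCP connection for every call, much like running a
    separate curl command per request.
    """
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.headers.get(header, "unknown")


def tally_backends(fetch: Callable[[], str], requests: int) -> Counter:
    """Call fetch() the given number of times and count hits per real server."""
    return Counter(fetch() for _ in range(requests))
```

To use it, call `tally_backends(lambda: fetch_backend_id(url), 100)` once with `url` set to the origin VIP and once with it set to the CDN hostname, then compare the two counts.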

When a web browser opens a connection to a web server it will generally send multiple requests across a single TCP connection, similar to the figure below. Some browsers will even use HTTP pipelining if both the server and browser support it, sending multiple requests without waiting for the corresponding HTTP responses.

HTTP Pipeline

The majority of load balancers, including the Cisco ACE 4710 and the A10 AX ADC/Thunder, will look at the first request in the TCP connection and apply the load-balancing metric and forward the traffic to a specific real server in the VIP. In order to speed the processing of future requests the load balancer will forward all traffic in that connection to the same real server in the VIP. This generally isn’t a problem if there’s only a single user associated with a TCP connection.

HTTP Pipeline Servers

Akamai will attempt to optimize the number of TCP connections from their edge servers to your origin web servers by sending multiple requests from different users all over the same TCP connection. In the example below there are requests from three different users but it’s been my experience that you could see requests for dozens or even hundreds of users across the same TCP connection.

HTTP Pipeline with Akamai

And here lies the problem: the load balancer will only evaluate the first request in the TCP connection. All subsequent requests are sent to the same real server, leaving some servers overutilized and others underutilized.

HTTP Pipeline with Akamai Servers Single

Thankfully there are configuration options in the majority of load balancers to work around this problem and instruct the load balancer to evaluate all requests in the TCP connection independently.

A10 AX ADC/Thunder

strict-transaction-switch

Cisco ACE 4710

parameter-map type http HTTP_PARAMETER_MAP
  persistence-rebalance strict

With the configuration change made, every request in the TCP connection is now evaluated and load-balanced independently, resulting in a more even distribution across the real servers in the farm.

HTTP Pipeline with Akamai Servers
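The difference between the two behaviours can be illustrated with a toy simulation. This is a simplified sketch, not how either load balancer is implemented: it assumes plain round-robin, ignores cookie persistence, and processes connections one after another. The server names and per-connection request counts are made up.

```python
from collections import Counter
from itertools import cycle


def simulate(connections, servers, per_request):
    """Return the number of requests each server receives.

    connections: list of request counts, one entry per TCP connection.
    per_request: False picks a server once per connection (the default
                 behaviour described above); True picks a server for every
                 request (the rebalancing behaviour).
    """
    round_robin = cycle(servers)
    hits = Counter({name: 0 for name in servers})
    for request_count in connections:
        if per_request:
            for _ in range(request_count):
                hits[next(round_robin)] += 1
        else:
            # The whole connection sticks to whichever server was picked first.
            hits[next(round_robin)] += request_count
    return hits


servers = ["web1", "web2", "web3"]
# Hypothetical mix: one busy CDN connection carrying many users' requests,
# plus a few lightly used connections.
connections = [300, 20, 10, 5]

per_connection = simulate(connections, servers, per_request=False)
rebalanced = simulate(connections, servers, per_request=True)
print("per connection:", dict(per_connection))  # web1: 305, web2: 20, web3: 10
print("per request:   ", dict(rebalanced))      # web1: 112, web2: 112, web3: 111
```

With per-connection selection the server that happens to receive the busy connection takes almost all of the load; with per-request selection the same 335 requests spread almost evenly.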

In this scenario I’m using HTTP cookies to provide session persistence and ‘stickiness’ for the user sessions. If your application is stateless then you don’t really need to worry that a user lands on the same real server for each and every request.

Cheers!

Image Credit: topfer

Web Application Load Testing – TCP Port Exhaustion

I recently ran into a puzzling issue with a web framework that was failing to perform under a load test. This web framework was being front-ended by a pair of Cisco ACE 4710 Application Control Engines (load balancers) using a single IP address in a SNAT pool. The Cisco ACE 4710 was the initial suspect, but a quick analysis determined that we were potentially experiencing a TCP port exhaustion issue because the test would start failing at almost the same point every time. While the original suspect was the Cisco ACE 4710 it turned out to be a TCP port exhaustion […]

Response: Scripting Does Not Scale For Network Automation

About three weeks ago Greg Ferro from Etherealmind posted an article entitled "Scripting Does Not Scale For Network Automation". It's quite clear from reading the article that Greg really is "bitter and jaded". While I agree that there are challenges in scripting, they also come with some large rewards for those who are able to master the skill. In a subsequent comment Greg really hits on his point: "We need APIs for device consistency, frameworks for validation and common actions. But above that we need platforms that solve big problems - scripting can only solve little problems." I agree […]

Your customer needs help? Tell them to hire me!

This is a little off-topic, but I've probably let this slide for too long. Unfortunately I've been going around with this pent-up anger for quite some time now, and it's time to vent and rant. I provide a blog and forum to the community as a way to help educate people and hopefully learn a little something myself along the way. I'm generally interested in targeting the actual end-user, the network engineer or system administrator who's working for Acme Corp. or Wayne Enterprises or the Umbrella Corp; hopefully you get the idea. Inevitably there will be a reseller or […]

CrashPlan filling up your SSD?

Over the weekend I actually had some downtime and was hoping to play a little Planetside 2 until I noticed that my Windows 7 desktop was down to only 8GB of free space on my 256GB SSD. A quick check with WinDirStat found that I had over 133GB of files in C:\ProgramData\CrashPlan, even though I had installed the software into D:\Program Files (x86)\CrashPlan. I've been testing CrashPlan for the past 30 days trying to decide if it was the best tool available for me to use in backing up the numerous desktops and laptops throughout the house. I had been […]