I recently had a conversation with a DevOps colleague (let’s not jeer too loudly) who was trying to understand why he couldn’t get more than 350Mbps between two servers over a 1Gbps WAN connection. He thought there must be a problem with the network and suggested that I open a ticket with the carrier to “fix” the issue. I tried to explain that the latency between the two servers, which sit 3,000 miles apart, was limiting TCP performance, and that he could potentially overcome it by using multiple TCP sockets with larger TCP window sizes, or by switching from TCP to UDP.
I used iPerf3 to demonstrate the issue. With a single stream/thread we were able to achieve ~350Mbps. With a second stream/thread we hit ~600Mbps, and with a third we hit ~789Mbps.
It wasn’t magic. It’s a well-known fact that latency plays a huge role in TCP performance, and to understand why, you need to understand how TCP works. TCP lets a sender transmit only a limited amount of data before it must wait for an acknowledgement from the receiver. The TCP window size sets that limit: a larger window allows more data to be in flight before an acknowledgement is required. Because each acknowledgement takes a full round trip to come back, a single connection can deliver at most one window of data per round-trip time, and that delay is what limits the performance.
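To put rough numbers on that, here is a minimal Python sketch of the window-per-round-trip limit. The 70 ms RTT and 64 KiB window are assumed values for illustration only (they were not measured in the test above), and the function names are my own.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP connection: one window of data per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bps * rtt_seconds / 8


# Assumed, illustrative values (not measurements from the iPerf3 test).
RTT_SECONDS = 0.070          # hypothetical 70 ms round-trip time
WINDOW_BYTES = 64 * 1024     # hypothetical 64 KiB TCP window
LINK_BPS = 1_000_000_000     # the 1Gbps WAN link

limit_mbps = max_tcp_throughput_bps(WINDOW_BYTES, RTT_SECONDS) / 1e6
needed_mb = window_needed_bytes(LINK_BPS, RTT_SECONDS) / 1e6

print(f"One stream, 64 KiB window: at most {limit_mbps:.1f} Mbps")
print(f"Window needed to fill 1Gbps: about {needed_mb:.2f} MB")
```

With those assumed numbers a single stream tops out well below the link speed, which is why adding parallel streams (or growing the window) raises the total.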
There is a well-written blog article from Netbeez by Stefano Gridelli, titled “Impact of Packet Loss and Round-Trip Time on Throughput”, that covers this topic in great detail. You can even apply a mathematical formula to estimate the maximum potential throughput for a known RTT.
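One widely used formula of this kind is the Mathis et al. approximation, which estimates the throughput ceiling of a single TCP stream from the segment size, the RTT, and the packet loss rate. I am assuming here that this is the sort of formula meant; the article's exact form may differ. The input values below are hypothetical, and the function name is my own.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2).

    The result is a rough ceiling for one TCP stream, not a prediction.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in the range (0, 1]")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))


# Assumed, illustrative values (not taken from the article or the iPerf3 test).
MSS_BYTES = 1460       # typical MSS on a 1500-byte MTU path
RTT_SECONDS = 0.070    # hypothetical 70 ms round-trip time
LOSS_RATE = 0.0001     # hypothetical 0.01% packet loss

estimate_mbps = mathis_throughput_bps(MSS_BYTES, RTT_SECONDS, LOSS_RATE) / 1e6
print(f"Estimated ceiling for one stream: {estimate_mbps:.1f} Mbps")
```

The shape of the formula is the useful part: the ceiling scales inversely with RTT, so doubling the round-trip time halves it, and even a small amount of loss drags it down sharply.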