Times Have Changed. TCP Hasn't.
Transmission Control Protocol was designed and implemented decades ago. Today, the public network looks very different from how it did back then. TCP is a brilliantly designed and implemented protocol that has served us well. However, its design principles from 35 years ago no longer apply.
TCP congestion control assumes that any error – e.g., a lost ACK – is caused by congestion. This allows TCP to be fair when sharing a critical resource. Decades ago, that made sense; today, it doesn’t. Back then, the critical resource was limited bandwidth in the core of ARPANET. Today, the core of the network is no longer a critical resource. It offers ludicrous capacity and speed. There is no need for TCP to be polite: bandwidth is plentiful.
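To make the loss response described above concrete, here is a minimal sketch of additive-increase/multiplicative-decrease (AIMD), the textbook shape of classic TCP congestion avoidance. It is a simplification, not any real stack's implementation: the window is counted in whole segments, slow start and timeouts are ignored, and the names `aimd_step` and `simulate` are made up for illustration.

```python
def aimd_step(cwnd: float, loss_detected: bool,
              increase: float = 1.0, min_cwnd: float = 1.0) -> float:
    """Advance a simplified congestion window by one round trip.

    cwnd is measured in segments. This is an illustrative model only.
    """
    if loss_detected:
        # Multiplicative decrease: the loss is read as a congestion signal,
        # so the sender halves its window regardless of the real cause.
        return max(cwnd / 2.0, min_cwnd)
    # Additive increase: probe for one more segment per round trip.
    return cwnd + increase


def simulate(initial_cwnd: float, loss_rounds: set, rounds: int) -> list:
    """Return the window size after each round trip.

    loss_rounds holds the (zero-based) round numbers in which a loss occurs.
    """
    cwnd = float(initial_cwnd)
    history = []
    for round_no in range(rounds):
        cwnd = aimd_step(cwnd, round_no in loss_rounds)
        history.append(cwnd)
    return history


if __name__ == "__main__":
    # A single loss in round 2 halves the window, whatever caused it.
    print(simulate(10, {2}, 4))
```

The point of the sketch is that the `loss_detected` branch has no way to tell a corrupted packet from a congested queue: both cut the sending window in half.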
What about the access portion of the network, that first mile / last mile technology that connects users to that ridiculously fast and capable core?
Let’s use DOCSIS 3.1 as an example. In a typical system, the Cable Modem Termination System transmits to homes in a single channel while homes transmit to the CMTS using a Time Division Multiple Access strategy on multiple channels. The key here is that you have your own dedicated upstream time slot. Your neighbor also has a dedicated time slot.
When your legacy TCP connection detects a lost packet, it assumes congestion and backs off in the name of fairness. That doesn't make much sense. To understand why, let’s go over three possibilities for the packet loss.
First, the error occurred in the core of the network. If this happens, the protocol should retransmit immediately. There is no need to wait or back off, and no need to assume the problem was caused by congestion. There is plenty of bandwidth in the core, so there is no need to be fair.
Second, the congestion or error occurred in the upstream direction, from your house to the CMTS. The pinch point – where your congestion can occur – is limited to your upstream time slot. You may as well retransmit right away because you are only competing with yourself. An ideal solution would include QoS at the edge of your home to prioritize traffic, but that is a topic for a different blog post.
Third, the congestion or error occurred in the downstream direction, from the CMTS to your house. In this case, you are sharing downstream bandwidth with your neighbors. Many multiservice providers implement logic in the CMTS to ensure that you and your neighbors each get your fair share. So again, the protocols you run should assume you are only competing with yourself and retransmit right away.
Over time, any protocol must determine the effective throughput rate. However, a modern design should not assume that errors are due to congestion over a shared resource. Instead, the protocol should determine the effective throughput quickly and then assume that any drop not explained by exceeding that throughput is due to an error, not congestion.
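One way to read the idea above is sketched below. It is an illustration under stated assumptions, not an existing protocol: the `RateEstimator` class, the `on_loss` function, the exponentially weighted moving average, and the smoothing weight of 0.25 are all hypothetical choices made for this example. The sender tracks its effective throughput and, when a packet is lost, reduces its rate only if it was sending faster than that estimate.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RateEstimator:
    """Hypothetical effective-throughput tracker (illustration only)."""
    alpha: float = 0.25        # smoothing weight, an arbitrary choice
    estimate_bps: float = 0.0  # current estimate in bits per second

    def update(self, delivered_bytes: int, interval_s: float) -> float:
        """Fold one delivery measurement into the running estimate."""
        if interval_s <= 0:
            raise ValueError("interval_s must be positive")
        sample = delivered_bytes * 8 / interval_s
        if self.estimate_bps == 0.0:
            # The first measurement seeds the estimate directly.
            self.estimate_bps = sample
        else:
            self.estimate_bps = ((1 - self.alpha) * self.estimate_bps
                                 + self.alpha * sample)
        return self.estimate_bps


def on_loss(send_rate_bps: float, est: RateEstimator) -> Tuple[str, float]:
    """Classify a loss and return (assumed_cause, new_send_rate_bps).

    In both cases the lost data would be retransmitted right away;
    only the sending rate differs.
    """
    if send_rate_bps <= est.estimate_bps:
        # Sending within the measured throughput: treat the drop as an
        # error and keep the current rate.
        return ("error", send_rate_bps)
    # Sending above the measured throughput: fall back to the estimate.
    return ("over_rate", est.estimate_bps)


if __name__ == "__main__":
    est = RateEstimator()
    est.update(1_250_000, 1.0)          # 1.25 MB delivered in one second
    print(on_loss(8_000_000.0, est))    # within the estimate
    print(on_loss(12_000_000.0, est))   # above the estimate
```

Unlike a halving rule, this sketch never drops below the measured throughput in response to a single loss, which is the behavior the paragraph above argues for.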
In today's networking world, all too often you compete with yourself for bandwidth. It’s time the protocols you use optimize the bandwidth you purchase.