Posted by eieio 11 hours ago

It's Always TCP_NODELAY (brooker.co.za)
244 points | 67 comments
kazinator 10 hours ago|
> The bigger problem is that TCP_QUICKACK doesn’t fix the fundamental problem of the kernel hanging on to data longer than my program wants it to.

Well, of course not; it tries to reduce the problem of your kernel hanging on to an ack (or generating an ack) longer than you would like. That pertains to received data. If the remote end is sending you data, and has paused because unacknowledged data has filled its buffers, it behooves you to send an ack ASAP.

The original Berkeley Unix implementation of TCP/IP, I seem to recall, had a single global 500 ms timer for sending out acks. So when your TCP connection received new data eligible for acking, it could be as long as 500 ms before the ack was sent. If we reframe that in modern terms, where every other delay is negligible and data arrives at the line rate of a multi-gigabit connection, 500 ms represents a lot of unacknowledged bits.

Delayed acks are similar to Nagle in spirit in that they promote coalescing at the possible cost of performance. Under the assumption that the TCP connection is bidirectional and "chatty" (so that even when the bulk of the data transfer is happening in one direction, there are application-level messages in the other direction), the delayed ack creates opportunities for the TCP ACK to be piggybacked on a data transfer, avoiding a TCP segment that carries no data, only an ACK.

As far as portability of TCP_QUICKACK goes, in C it is as simple as #ifdef TCP_QUICKACK: if the constant exists, use it; otherwise you're out of luck. In another language, you have to jump through some hoops, depending on whether the network-related runtime exposes nonportable options in a way you can test for, or whether you are on your own.
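
A minimal sketch of that guard (Linux socket API; enable_quickack is just an illustrative name). Note that on Linux the option is not sticky, so a program that always wants immediate acks re-arms it after reads:

    #include <sys/socket.h>
    #include <netinet/in.h>   /* IPPROTO_TCP */
    #include <netinet/tcp.h>  /* TCP_QUICKACK, where it exists */

    static void enable_quickack(int fd)
    {
    #ifdef TCP_QUICKACK
        int one = 1;
        /* Linux: not sticky; re-arm after each recv() if you always
           want immediate acks. Failure here is harmless, so ignore it. */
        (void)setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
    #else
        (void)fd;  /* no-op on platforms without the option */
    #endif
    }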

joelthelion 3 hours ago||
What happens when you change the default when building a Linux distro? Did anyone try it?
saghm 9 hours ago||
I first ran into this years ago while working on a database client library as an intern. Not having heard of the option, I didn't think to enable it on the connections the library opened, and in practice that often meant messages in the wire protocol sat fully serialized without actually being sent immediately. I only found out later, when a user investigated why the latency was much higher than they expected; either they had run into this before or they figured out it might be the culprit, and it turned out that pretty much all of the existing clients in other languages set TCP_NODELAY unconditionally.
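
For anyone who hasn't seen it, the fix is a one-line socket option; a minimal sketch in C against the standard sockets API:

    #include <sys/socket.h>
    #include <netinet/in.h>   /* IPPROTO_TCP */
    #include <netinet/tcp.h>  /* TCP_NODELAY */

    /* Disable Nagle's algorithm so small writes are sent immediately. */
    static int set_nodelay(int fd)
    {
        int one = 1;
        return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    }
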
hathawsh 8 hours ago||
Ha ha, well that's a relief. I thought the article was going to say that enabling TCP_NODELAY is causing problems in distributed systems. I am one of those people who just turn on TCP_NODELAY and never look back because it solves problems instantly and the downsides seem minimal. Fortunately, the article is on my side. Just enable TCP_NODELAY if you think it's a good idea. It apparently doesn't break anything in general.
foltik 8 hours ago||
Why doesn't Linux just add a kconfig option that enables TCP_NODELAY system-wide? It could be enabled by default on modern distros.
indigodaddy 7 hours ago|
Looks like there is a sysctl option for BSD/macOS, but on Linux it must be done at the application level?
hsn915 8 hours ago||
Wouldn't distributed systems benefit from using UDP instead of TCP?
0xbadcafebee 7 hours ago|
Only if you're sending data you don't mind losing and getting out of order
neomantra 4 hours ago||
This is true for simple UDP, but reliable transports are often built over UDP.

As with anything in computing, there are trade-offs between the approaches. One example is QUIC, now widespread in browsers.

MoldUDP64 is used by various exchanges (that's NASDAQ's name for it; others do something close). It's a simple UDP protocol with sequence numbers, and it works great on quality networks with well-tuned receivers (or FPGAs). This is an old-school blog article about the earlier MoldUDP:

https://www.fragmentationneeded.net/2012/01/dispatches-from-...
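
To make the idea concrete, here is an illustrative sketch of receiver-side gap detection in a sequence-numbered UDP feed; the header layout and names are hypothetical, not the actual MoldUDP64 wire format:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical header for a sequence-numbered UDP feed. */
    struct feed_header {
        uint64_t seq;        /* sequence number of first message in packet */
        uint16_t msg_count;  /* number of messages in this packet */
    };

    /* Returns the next expected sequence number. On a gap, a real
       receiver would request retransmission from a recovery service. */
    static uint64_t on_packet(uint64_t expected, const struct feed_header *h)
    {
        if (h->seq > expected)
            printf("gap: missing %llu..%llu, request retransmit\n",
                   (unsigned long long)expected,
                   (unsigned long long)(h->seq - 1));
        /* Old/duplicate packets (h->seq + msg_count <= expected) are ignored. */
        if (h->seq + h->msg_count > expected)
            expected = h->seq + h->msg_count;
        return expected;
    }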

Another is Aeron.io, which is a high-performance messaging system that includes a reliable unicast/multicast transport. There is so much cool stuff in this project and it is useful to study. I saw this deep-dive into the Aeron reliable multicast protocol live and it is quite good, albeit behind a sign-up.

https://aeron.io/other/handling-data-loss-with-aeron/

nospice 2 hours ago||
Strictly speaking, you can put any protocol on top of UDP, including a copy of TCP...

But I took the parent's question as "should I be using UDP sockets instead of TCP sockets?" Once you invent your own protocol on top of UDP (or in place of it), you can have any features you want.

rowanG077 9 hours ago||
I fondly remember a simple simulation project we did in a group of 5 students in a second-year class: a simulation and some kind of scheduler that communicated via TCP. I was appalled at the performance we were getting; even on the same machine it was way too slow for what it was doing. After hours of debugging it turned out to be Nagle's algorithm causing the slowness, which I had never heard of at the time. Fixed instantly with TCP_NODELAY. It was one of the first times it was made abundantly clear to me that the teachers at that institution didn't know what they were teaching. Apparently we were the only group that had noticed the slow performance, and the teachers had never even heard of TCP_NODELAY.
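
The classic trigger is a write-write-read pattern: the second small write sits in the sender's buffer waiting for an ACK that the receiver's delayed-ACK timer is holding back. A hedged sketch of the pathological sender (names and framing invented for illustration):

    #include <stddef.h>
    #include <unistd.h>

    /* With Nagle enabled, the second small write is buffered until the
       first one is ACKed, and the peer's delayed-ACK timer can hold that
       ACK back, stalling every request by tens of milliseconds. */
    static void send_request(int fd, const char *hdr, size_t hlen,
                             const char *body, size_t blen)
    {
        (void)write(fd, hdr, hlen);   /* small segment, sent immediately */
        (void)write(fd, body, blen);  /* waits for the ACK of the first  */
        /* ... then read() the reply: one stall per round trip. */
    }
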
jonstewart 9 hours ago||
<waits for animats to show up>
Animats 7 hours ago||
OK, I suppose I should say something. I've already written on this before, and that was linked above.

You never want the combination of TCP_NODELAY off at the sending end and delayed ACKs on at the receiving end. But there's no way to set both from one end. Hence the problem.

Is leaving TCP_NODELAY off still necessary? Try doing one-byte TCP sends in a tight loop and see what it does to other traffic on the same path, say, a cellular link. Each one-byte payload carries roughly 40 bytes of TCP/IP headers, hence the 40x extra traffic; today's links may be able to tolerate it. Nagle's algorithm was originally put in as a protection against badly behaved senders.

A delayed ACK should be thought of as a bet on the behavior of the listening application. If the listening application usually responds fast, within the ACK delay interval, the delayed ACK is coalesced into the reply and you save a packet. If the listening application does not respond immediately, a delayed ACK has to actually be sent, and nothing was gained by delaying it. It would be useful for TCP implementations to tally, for each socket, the number of delayed ACKs actually sent vs. the number coalesced. If many delayed ACKs are being sent, ACK delay should be turned off, rather than repeating a losing bet.
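
A sketch of that adaptive policy as hypothetical per-socket, receiver-side logic (not any real kernel's code):

    /* Hypothetical counters for the "delayed ACK as a bet" policy. */
    struct ack_stats {
        unsigned long coalesced;  /* delayed ACKs piggybacked on data  */
        unsigned long timed_out;  /* delayed ACKs sent by timer expiry */
        int delay_enabled;
    };

    /* Called whenever an ACK leaves the socket. If the bet keeps
       losing, stop delaying ACKs on this connection. */
    static void on_ack_sent(struct ack_stats *s, int was_coalesced)
    {
        if (was_coalesced)
            s->coalesced++;
        else
            s->timed_out++;
        if (s->timed_out > 16 && s->timed_out > 2 * s->coalesced)
            s->delay_enabled = 0;  /* repeating a losing bet; turn it off */
    }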

This should have been fixed forty years ago. But I was out of networking by the time this conflict appeared. I worked for an aerospace company, and they wanted to move all networking work from Palo Alto to Colorado Springs, Colorado. Colorado Springs was building a router based on the Zilog Z8000, purely for military applications. That turned out to be a dead end. The other people in networking in Palo Alto went off to form a startup to make a "PC LAN" (a forgotten 1980s concept), and for about six months, they led that industry. I ended up leaving and doing things for Autodesk, which worked out well.

e40 9 hours ago|||
Came here thinking the same thing...
armitron 9 hours ago||
Assuming he shows up, he'll still probably be trying to defend the indefensible...

Disabling Nagle's algorithm should be done as a matter of principle, there's simply no modern network configuration where it's beneficial.

mmaunder 5 hours ago|
PSA: UDP exists.