Posted by jacobedawson 10/29/2025
We needed to do a nightly transfer of data. We had a variable amount of data to transfer, but typically in the range of one to two TB. We had a 1 Gbit link between the data centres housing the two systems, but it wasn't an exclusive link - backups and other stuff would be running during the night as well, so if we hogged all the bandwidth we'd have to deal with unhappy people. The hard deadline for the transfer was the start of the work day.
Now the data does compress easily - but it is only available for compression at the beginning of our sync window. We definitely needed to compress some of the data to sync everything in time and keep other users of the line happy. But: if we spent too much time compressing we might not have enough time left to send the data, plus we weren't alone on the systems - other people would be unhappy about their nightly jobs failing if we hogged all the available CPU time.
So we needed to find the right balance of data compression and bandwidth utilisation, taking into account all those factors, to make things work in the amount of time we had available.
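To make the trade-off concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch, not the program described above: the function name, the parameters, and every number plugged in below are assumptions for illustration. It assumes decimal units (1 TB = 10^12 bytes), that compression and sending overlap as a stream, and that compression throughput scales linearly with cores.

```python
def transfer_hours(data_tb, link_gbit, link_share,
                   fraction_compressed, ratio,
                   compress_mb_s_per_core, cores):
    """Estimate the elapsed hours for one nightly sync.

    All parameters are hypothetical knobs for illustration:
      link_share             fraction of the link we allow ourselves (0..1)
      fraction_compressed    fraction of the data we compress (0..1)
      ratio                  original size / compressed size
      compress_mb_s_per_core compression throughput of a single core
      cores                  number of cores we allow ourselves
    """
    data_bytes = data_tb * 1e12
    link_bytes_per_s = link_gbit * 1e9 / 8 * link_share

    # Bytes on the wire: the uncompressed part plus the shrunken part.
    sent_bytes = (data_bytes * (1 - fraction_compressed)
                  + data_bytes * fraction_compressed / ratio)
    send_s = sent_bytes / link_bytes_per_s

    # CPU time spent compressing, spread across the cores we may use.
    compress_s = (data_bytes * fraction_compressed
                  / (compress_mb_s_per_core * 1e6 * cores))

    # With streaming, the slower of the two stages sets the pace.
    return max(send_s, compress_s) / 3600


# No compression, whole link to ourselves: purely network-bound.
print(transfer_hours(2, 1, 1.0, 0.0, 1.0, 50, 2))
# Compress everything with 2 cores on half the link: now CPU-bound.
print(transfer_hours(2, 1, 0.5, 1.0, 4.0, 50, 2))
```

Sweeping `fraction_compressed` and `cores` in a model like this shows where the bottleneck flips from the link to the CPU, which is the balance point the comment is describing.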
Thanks to AMD, nowadays we'd just throw more CPUs at the problem, but back then the 8-CPU server we were using was already quite expensive.
Now, I am not a programmer by trade, but I have a hard time thinking anyone would find it nice to write an inliner. At least not if you want the inliner to always make things faster.
> an insanely unreliable network (~95% uptime)
This is wild! Can you explain more?
Did you ever blog about this program? It sounds very interesting, and there is no job interview on HN!
I wish I had found a 3rd party tool to do all this, but I never did.
I even think that it's viable to output PDF without any libraries. I've investigated that format a bit and it doesn't seem too complicated, at least for relatively dumb output.
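For what it's worth, a minimal sketch of what "relatively dumb output" can look like in plain Python with no libraries is below. The function name `build_pdf` and the layout values (US Letter page, 24 pt Helvetica, text position) are my own assumptions; it handles a single line of Latin-1 text only and skips everything a real document would need (wrapping, embedded fonts, compression).

```python
def build_pdf(text: str) -> bytes:
    """Build a one-page PDF showing one line of text (Latin-1 only)."""
    # Backslash and parentheses are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    stream = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")

    # Object bodies, numbered 1..5 in this order.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one 20-byte entry per object, plus entry 0.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    with open("hello.pdf", "wb") as f:
        f.write(build_pdf("Hello, PDF!"))
```

The only fiddly part at this level is keeping the byte offsets in the cross-reference table accurate, which is why the whole file is assembled as bytes rather than text.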
What I know having done a lot in this space is we aren't close!
Formatting can be tough. See Knuth's extensive bug list for TeX from 1987 at https://yurichev.com/mirrors/knuth1989.pdf to see the kind of tarpit one can get trapped in.
With Dart, I felt that very often, every time I saved a file after editing (which activated the formatter), the code would jump around a lot, sometimes even for very small edits. I actually found myself saving less often, as I found the sudden reorganizing kind of jarring and disorienting.
Most of this, I felt, came from the wish to make the formatter keep lines inside a given width. While that's a goal I appreciate, I've come to think it's one of those things that's better done manually. The same goes for the use of whitespace in general, other than trivial stuff like consistent indentation. There's actual, important meaning that can be conveyed in how you arrange things, which I think is more important than having it always be exactly mathematically consistent.
It's also one of the reasons I still prefer ESLint over Prettier in JS land, even for stylistic rules. The fact that Prettier always rewrites the entire file from scratch from the parsed AST often ends up mangling some deliberate way that I'd arranged the code.
I run my formatters manually, so I can’t comment on the jumps in code. That does seem jarring.
But some things are not like that. Two statements being right against each other, or having an empty line between them, encodes information.
In a big function call with many arguments, where do you add line breaks between arguments? That can convey information as well. As the posted link says, those are some of the most difficult scenarios for a formatter to try to deal with, and my point is that I think it's not worth the effort.
I really don't like formatters where changing one small part of a large expression results in the entire expression being formatted very differently. It's simply not version control friendly, especially if the language encourages large statements like the pro example. I would rather accept a little bit of code ugliness in this case. Sure, this then means that the way the code is formatted is path dependent (it depends on the history of the code), but I think it's a reasonable compromise.
So I thought that there was no reason to try to solve all these problems, since it would require too much of a time investment. But without solving all of them, a semi-working formatter isn't good enough to be useful and not annoying.