The hardest program I've ever written (2015)

Posted by jacobedawson 6 days ago

The hardest program I've ever written (2015) (journal.stuffwithstuff.com)

96 points | 60 comments

georgeburdell 2 days ago|

Mine was an automated file transfer system that had to be 100% reliable on an insanely unreliable network (~95% uptime). Took about 9 months of bug squashing after development was done. So many edge cases. I would probably never mention this in a job interview because I doubt most people would understand why it was so hard.

bjoli 2 days ago||

I once wrote an inliner. Before you have done it, it seems simple. While you are doing it, it is like trying to restrain a large rabid dog with a slippery leash.

Now, I am not a programmer by trade, but I have a hard time thinking anyone would find it nice to write an inliner. At least not if you want the inliner to always make things faster.

throwaway2037 1 day ago|||

    > an insanely unreliable network (~95% uptime)

This is wild! Can you explain more?

Did you ever blog about this program? It sounds very interesting, and there is no job interview on HN!

georgeburdell 1 day ago||

Lots of things are baked into that 95% number. Sometimes the power would go out. Data recipients were careless about checking whether their computers were powered on and had a static IP assigned, but I’d certainly hear about it if they didn’t get their data. Network engineers would futz with settings and do poorly announced infrastructure upgrades. Rather than try to fix all these issues, I just wrote the code assuming we were working from a 3rd world country.

greazy 2 days ago||

I'm working on this exact same thing. Was your code ever published or did you blog about it?

georgeburdell 1 day ago||

Nope, just some internal-facing code. The challenges boiled down to tracking which parts of the data had been successfully sent to which recipients, and how to get proof that the data were transmitted correctly.

I wish I had found a 3rd party tool to do all this, but I never did.
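
To make the bookkeeping described above concrete, here is a minimal sketch in Python. It is not the poster's code: the `DeliveryLedger` class, the fixed-size chunking, and the use of SHA-256 digests as "proof" are all assumptions chosen for illustration. A chunk counts as delivered to a recipient only once that recipient reports a digest matching the sender's.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


def split_chunks(data: bytes, size: int) -> list[bytes]:
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(chunk: bytes) -> str:
    """SHA-256 of a chunk, used here as the evidence of correct transmission."""
    return hashlib.sha256(chunk).hexdigest()


@dataclass
class DeliveryLedger:
    """Hypothetical record of which chunk each recipient has confirmed."""
    chunks: list[bytes]
    confirmed: dict[str, set[int]] = field(default_factory=dict)

    def pending(self, recipient: str) -> list[int]:
        """Indices of chunks this recipient has not yet confirmed."""
        done = self.confirmed.get(recipient, set())
        return [i for i in range(len(self.chunks)) if i not in done]

    def acknowledge(self, recipient: str, index: int, reported: str) -> bool:
        """Mark a chunk delivered only if the recipient's digest matches ours."""
        if reported != digest(self.chunks[index]):
            return False  # mismatch: the chunk stays pending and can be resent
        self.confirmed.setdefault(recipient, set()).add(index)
        return True

    def complete(self, recipient: str) -> bool:
        """True once every chunk has been confirmed by this recipient."""
        return not self.pending(recipient)
```

After a power cut or a dropped connection, the sender would only need to resend whatever `pending(recipient)` returns. A real system would also have to persist the ledger, which this sketch leaves out.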

adammarples 1 day ago||

Is this not basically just torrenting?

georgeburdell 15 hours ago||

I guess, but some of the recipients are just 3rd party automated systems with a standard set of instructions, so I have no control over the protocol used

PaulKeeble 2 days ago||

About the worst job on any enterprise software project is the PDF output. They always end up doing it for emails or something else, and it's a never-ending list of bugs. Text formatting is a never-ending list of problems, since it has a lot of vague inputs and a relatively strict output. Far too many little details go wrong.

vbezhenar 2 days ago||

With PDF, my best approach was to go very low level. I've used the PDFKit and PDFBox libraries, and both provide a way to output vector operations. That lets you implement extremely performant code. The resulting PDF is tiny and looks gorgeous (because it's vector). And you can implement anything. The code will be verbose, but it's worth it.

I even think that it's viable to output PDF without any libraries. I've investigated that format a bit and it doesn't seem too complicated, at least for relatively dumb output.
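
As a rough illustration of that last point, the sketch below writes a one-page PDF with nothing but byte strings and offsets. It is only a sketch under stated assumptions: the function name `build_pdf` is made up, the text must be ASCII, there is no line wrapping, and the page is US Letter with the built-in Helvetica font. The fiddly part is the cross-reference table, which must record the byte offset of every object.

```python
def build_pdf(text: str) -> bytes:
    """Build a one-page PDF showing `text` in Helvetica, without any library."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Content stream: begin text, pick font and size, move, show string, end text.
    content = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("ascii")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)

    # Cross-reference table: one fixed-width 20-byte entry per object.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the result of `build_pdf("Hello")` to a file in binary mode should give a document that ordinary viewers can open. Fonts beyond the standard ones, compression, and multi-page layout are where the real work starts.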

mikeday 2 days ago|||

We've spent twenty years working on HTML to PDF conversion and I expect we could easily spend another twenty years, so feel free to give Prince a try if you would rather avoid the headache :)

PaulKeeble 1 day ago|||

Normally when we are nearly there we say it's 95% done and only 95% of the work remains. If your feeling is you are half done I suspect more than 50% of the work remains!

What I know having done a lot in this space is we aren't close!

mikeday 1 day ago||

yeah, we often refer to the first 90% of the work and the second 90% of the work lol.

twotwotwo 2 days ago|||

Awesome. From curiosity: is Prince's core still written in Mercury? (Looked at old comments.)

mikeday 1 day ago||

Absolutely! The CSS support, layout engine, PDF output, and JavaScript interpreter are all written in Mercury, while the font support that was originally a mix of Mercury and C has now been rewritten as a standalone Rust project, Allsorts.

kbbgl87 2 days ago|||

Thinking about phantomjs and rasterize brings back nightmares

sema4hacker 2 days ago||

> That handful of code took me almost a year to write.

Formatting can be tough. See Knuth's extensive bug list for TeX from 1987 at https://yurichev.com/mirrors/knuth1989.pdf to see the kind of tarpit one can get trapped in.

rafabulsing 2 days ago||

In my brief experiments with Flutter, I must admit I didn't enjoy the experience of using the autoformatting. Not knocking the author of the tool at all, I can definitely see how absurdly hard it is to create something that does what it is trying to do. And I'm not against autoformatting in general either. I think gofmt works much better, and that's in large part because it tries to do less.

With Dart, I felt that very often, every time I saved a file after editing (which activated the formatter), the code would jump around a lot, sometimes even for very small edits. I actually found myself saving less often, as I found the sudden reorganizing kind of jarring and disorienting.

Most of this, I felt, came from the wish to make the formatter keep lines within a given width. While that's a goal I appreciate, I've come to think it is one of those things that's better done manually. The same goes for the use of whitespace in general, other than trivial stuff like consistent indentation. There's actual, important meaning that can be conveyed in how you arrange things, which I think is more important than having it always be exactly mathematically consistent.

It's one of the reasons I still prefer ESLint over Prettier in JS land, also, even for stylistic rules. The fact that Prettier always rewrites the entire file from scratch with the parsed AST often ends up mangling some deliberate way that I'd arrange the code.

fn-mote 2 days ago|

One of the lessons I took from the formatters (Python, Go, Rust) is that enforcing the same style ends all of the drama - indeed, all of the thinking - about how to format code. I like that.

I run my formatters manually, so I can’t comment on the jumps in code. That does seem jarring.

rafabulsing 1 day ago||

Again, I don't entirely disagree. Some choices are entirely stylistic: single vs double quotes for strings. Tabs vs spaces. Indentation width. Trailing commas. Those (and others) encode literally 0 meaning, and any time discussing those is time wasted, so they should be auto formatted away.

But some things are not like that. Two statements being right against each other, or having an empty line between them, encodes information.

In a big function call with many arguments, where do you add line breaks between arguments? That can convey information as well. As the posted link says, those are some of the most difficult scenarios for a formatter to try to deal with, and my point is that I think it's not worth the effort.

kccqzy 1 day ago||

> Note that “best” is a property of the entire statement being formatted. A line break changes the indentation of the remainder of the statement, which in turn affects which other line breaks are needed.

I really don't like formatters where changing one small part of a large expression results in the entire expression formatted very differently. It's simply not version control friendly. Especially if the language encourages large statements like the pro example. I would rather accept a little bit of code ugliness in this case. Sure this then means that the way the code is formatted is path dependent (depends on the history of the code), but I think it's a reasonable compromise.

Panzerschrek 2 days ago||

For some time I tried to write a formatting tool for my programming language. After achieving first results I gave up. I found that writing a formatter is a surprisingly hard task. Operating at the token level can't give a good enough result, so a proper parser is necessary. Reusing the existing parser isn't possible, since it ignores whitespace and comments. The need to preserve user-defined formatting in ambiguous cases creates even more problems.

So I decided there was no reason to try to solve all these problems, since it would require too much time. But without solving all of them, a semi-working formatter isn't good enough to be useful and not annoying.
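
The point about an existing parser discarding comments can be seen with Python's own standard library, used here purely as an illustration (the poster's language and parser are not shown in the thread). Going through the `ast` module loses comments and layout, while the lower-level `tokenize` module still reports them. The names `SOURCE`, `roundtrip_through_ast`, and `comments_seen_by_tokenizer` are made up for this sketch, and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import io
import tokenize

# A comment plus a hand-aligned list: two things a formatter must not lose.
SOURCE = "x = 1  # keep me\ny = [1,\n     2]\n"


def roundtrip_through_ast(source: str) -> str:
    """Parse to an AST and print it back; comments and layout do not survive."""
    return ast.unparse(ast.parse(source))


def comments_seen_by_tokenizer(source: str) -> list[str]:
    """The tokenizer still reports each comment as a token of its own."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


if __name__ == "__main__":
    print(roundtrip_through_ast(SOURCE))       # the comment is gone
    print(comments_seen_by_tokenizer(SOURCE))  # the comment is still visible
```

A formatter therefore tends to need its own parser, or a syntax tree that carries comments and blank lines along with the nodes, which is a large part of the cost described above.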

skopje 1 day ago||

It's funny when you have programmed long enough to see the same language/app names recycled. Two examples that come to mind: Dart, which was an RTL validation scripting language in the 80's/90's used by CPU designers, and Elm, which was a mail program in Unix/AIX/SunOS in the 80's/90's. Even weirder, when googling for the "old" Dart, it referred to RTL as right-to-left, and not register transfer level.

kccqzy 1 day ago||

Most of the hardness of this program comes from a user-adjustable line length limit. The author on multiple occasions used a "blog-friendly 40-char line limit" which is of course insanely hard. The program can be made much much simpler if the line length is unlimited. And if that's not an option, the program can be made somewhat simpler by only allowing line limits that are reasonably large, such as 100 characters.
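
A toy example shows how much the line limit drives the output. The sketch below uses a greedy "keep it on one line if it fits, otherwise one argument per line" rule. That rule is an assumption for illustration and not the approach from the linked article. The `Call` type and the `flat` and `fmt` functions are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Call:
    """A function call whose arguments are plain names or nested calls."""
    name: str
    args: list[Call | str] = field(default_factory=list)


def flat(node: Call | str) -> str:
    """Render the node on a single line, ignoring any width limit."""
    if isinstance(node, str):
        return node
    return f"{node.name}({', '.join(flat(a) for a in node.args)})"


def fmt(node: Call | str, width: int, indent: int = 0, suffix: int = 0) -> str:
    """Keep the node on one line if it fits within `width`, else split its args.

    `indent` is the column the node starts at; `suffix` reserves room for
    characters that follow it on the same line (the trailing comma).
    """
    one_line = flat(node)
    if isinstance(node, str) or indent + len(one_line) + suffix <= width:
        return one_line
    inner = indent + 4
    lines = [f"{node.name}("]
    for arg in node.args:
        # Splitting here moves every argument to a deeper indentation level,
        # which changes whether the nested calls still fit.
        lines.append(" " * inner + fmt(arg, width, inner, suffix=1) + ",")
    lines.append(" " * indent + ")")
    return "\n".join(lines)
```

For `Call("foo", ["alpha", Call("bar", ["beta", "gamma"]), "delta"])`, a limit of 80 keeps everything on one line, 21 splits only the outer call, and 20 forces the nested `bar(...)` call to split as well. One column of difference adds three lines, and the smaller the limit, the more often such cascades occur.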

juangacovas 1 day ago||

IMAP email parser for a ticket-like program. Even with abstractions and libraries, email can be such a pita...

b4ckup 2 days ago|

I once wrote a formatter for Power Query that's still in use today. It's a much simpler language and I took a simpler approach. It was a really fun problem to solve.