-2000 Lines of code (2004)

Posted by xeonmc 5 days ago

-2000 Lines of code (2004) (www.folklore.org)

538 points | 245 comments

bironran 5 days ago|

One of my best commits was removing about 60K lines of code, a whole "server" (it was the early 2000s) that had to hold all of its state in memory, and replacing it with about 5K of logic that was lightweight enough to piggyback onto another service and had no in-memory state at all. That was a pure algorithmic win: figuring out that a specific guided subgraph isomorphism where the target was a tree (a directed, acyclic graph with a single root) was possible with a single walk through the origin (general) directed bi-graph while emitting vertices and edges to the output graph (tree) and maintaining only a small in-process peekable stack of steps taken from the root that can affect the current generation step (not necessarily just the parent path).

I still remember the behemoth of a commit that was "-60,000 (or similar) lines of code". Best commit I ever pushed.

Those were fun times. I haven't done anything algorithmically impressive since.

ifellover 4 days ago||

I’m a hobby programmer and lucky enough to script a lot of things at work. I consider myself fairly adept at some parts of programming, but comments like these make it so clear to me that I have an absolutely massive universe of unknowns that I’m not sure I have enough of a lifetime left to learn about.

Cthulhu_ 4 days ago|||

I want to believe a lot of these algorithms will "come to you" if you're ever in a similar situation; only later will you learn that they have a name, or there's books written about it, etc.

But a lot is opportunity. Like, I had the opportunity to work on an old PHP backend, 500ms - 1 second response times (thanks in part to it writing everything to a giant XML string which was then parsed and converted to a JSON blob before being sent back over the line). Simply rewriting it in naive / best practices Go changed response times to 10 ms. In hindsight the project was far too big to rewrite on my own and I should have spent six months to a year trying to optimize and refactor it, but, hindsight.

nomel 5 hours ago|||

This is my experience, and my favorite way to learn: go in blindly, look things up when you get stuck/run out of ideas. I think it forces a deeper understanding of the topic, and really helps things "stick". I assume it's related to the massive dumps of dopamine that come from the "Eureka!" moments.

I've "invented" all sorts of old patents, all sorts of algorithms, including the PID algorithm. I think it helped form a very practically useful "intuition".

But, I've noticed that some people are passionately against this type of self exploration.

geon 4 days ago|||

Yes. I invented the Trie data structure when I was 19. It was very exciting finding out it had a name, and it was indeed considered a good fit for my use case.

tired-chimp 4 days ago||

That's so funny, I had the exact same experience. And when I was 16 I "invented" CSVs because I was too lazy to set up SQL for my Discord bot. I like to think I've gotten better at searching for the correct solution to things rather than just jumping in with my best guess.

bhaak 4 days ago||

LLMs are pretty good at providing names and search terms for very vague prompts.

Although that's also often an invitation for hallucinations so you have to be even more careful than usual.

mikepurvis 4 days ago||

I was just going to say the same— LLMs are great for giving a name to a described concept, architecture, or phenomenon. And I would argue that hallucinations don't actually much matter for this usage as you're going to turn around and google the name anyway, once you've been told it.

PaulRobinson 4 days ago||||

Read some good books on data structures and algorithms, and you'll be catching up with this sort of comment in no time. And then realise there will always be a universe of unknowns to you. :-) Good luck, and keep going.

fuzztester 4 days ago|||

zen comment :)

uncatchable, so I won't even try.

rangerelf 4 days ago|||

The more you know, the more you know you don't know.

HenryBemis 4 days ago|||

do try (so you get the joy of 'small' wins), also do know that it's untouchable (so you don't despair when you don't master quantum mechanics in one lifetime)

fuzztester 3 days ago||

oh, i only meant that rhetorically.

no worries.

rizky05 4 days ago|||

[dead]

amake 4 days ago||||

(More than?) half of the difficulty comes from the vocabulary. It’s very much a shibboleth—learn to talk the talk and people will assume you are a genius who walks the walk.

bironran 4 days ago||

That! It took me a while to start. My education in graph theory wasn't much better than your average college grad's. But I found that fascinating and started reading. I was also very lucky to have had two great mentors - my TL and the product's architect; the former helped me to expand my understanding of the field.

weaksauce 4 days ago||||

it’s just graph theory nomenclature. if you study an intro to graph algorithms it would get you most of the way there.

meistertigran 4 days ago||||

A lot of it is just technical jargon. Which doesn't mean it's bad, one has to have a way to talk about things, but the underlying logic, I've found, is usually graspable for most people.

It's the difference between hearing a lecture from a "bad" professor in Uni and watching a lecture video by Feynman, where he tries to get rid of scientific terms, when explaining things in simple terms to the public.

As long as you get a definition for your terms, things are manageable.

dev0p 4 days ago||||

I've been coding for a living for 10 years and that comment threw me for a loop as well. Gotta get to studying some graph theory I guess?

neilv 4 days ago|||

You could've figured out this one with basic familiarity with how graphs are represented, constructed, and navigated, and just working through it.

One way to often arrive at it is to just draw some graphs, on paper/whiteboard, and manually step through examples, pointing with your finger/pen, drawing changes, and sometimes drawing a table. You'll get a better idea of what has to happen, and what the opportunities are.

This sounds like "Then draw the rest of the owl," but it can work, once you get immersed.

Then code it up. And when you spot a clever opportunity, and find the right language to document your solution, it can sound like a brilliant insight that you could just pull out of the air, because you are so knowledgeable and smart in general. When you actually had to work through that specific problem, to the point you understood it, like Feynman would want you to.

I think Feynman would tell us to work through problems. And that Feynman would really f-ing hate Leetcode performance art interviews (like he was dismayed when he found students who'd rote-memorize the things to say). Don't let Leetcode asshattery make you think you're "not good at" algorithms.

marksbrown 2 days ago|||

Funnily enough, a common idea! What would Feynman do? | Fabulous adventures in coding https://share.google/iSEhAqL9NhAstSlRE

bironran 4 days ago|||

I despise leetcode interviews. These days, with coding LLMs, I see them as even less relevant than they were before.

Yet, you ask someone "how do you build an efficient LFU" and get blank stares (I just LOVE the memcache solution of regions and probabilistic promotion/demotion).
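For anyone drawing a blank on that question, here is a minimal sketch of one common textbook answer: an LFU cache with O(1) `get`/`put` built from frequency buckets. To be clear, this is not the memcache region/probabilistic scheme mentioned above, and the class and method names (`LFUCache`, `_touch`) are made up for the example.

```python
from collections import OrderedDict, defaultdict


class LFUCache:
    """Least-frequently-used cache; ties are broken by least-recent use."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.values = {}                          # key -> cached value
        self.freq = {}                            # key -> number of uses
        self.buckets = defaultdict(OrderedDict)   # use count -> keys, oldest first
        self.min_freq = 0                         # smallest use count present

    def _touch(self, key) -> None:
        # Move the key from its current frequency bucket to the next one.
        count = self.freq[key]
        del self.buckets[count][key]
        if not self.buckets[count]:
            del self.buckets[count]
            if self.min_freq == count:
                self.min_freq = count + 1
        self.freq[key] = count + 1
        self.buckets[count + 1][key] = None

    def get(self, key, default=None):
        if key not in self.values:
            return default
        self._touch(key)
        return self.values[key]

    def put(self, key, value) -> None:
        if self.capacity <= 0:
            return
        if key in self.values:
            self.values[key] = value
            self._touch(key)
            return
        if len(self.values) >= self.capacity:
            # Evict the oldest key among those with the lowest use count.
            victim, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.values[victim]
            del self.freq[victim]
        self.values[key] = value
        self.freq[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1
```

The trade-off in this version is extra bookkeeping per key (a count plus a bucket slot) in exchange for constant-time eviction; approximate schemes give up exactness to save that memory.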

sensanaty 4 days ago|||

I guess you're the reason we get asked all those "Invert a binary tree" type questions these days!

Jokes aside, could I get a layman's explanation of the graph theory stuff here? Sounds pretty cool but the terminology escapes me

ninetyninenine 4 days ago|||

I deleted an entire micro service of task runners and replaced it with a library that uses setTimeout as the primitive driving tasks from our main server.

It’s because every task was doing a database call but they had a whole repo and aws lambdas for running it. Stupidest thing I’ve ever seen.

motorest 3 days ago||

> I deleted an entire micro service of task runners and replaced it with a library that uses setTimeout as the primitive driving tasks from our main server.

Your example raises some serious red flags. Did it ever dawn on you that the reason these background tasks were offloaded to a dedicated service might have been to shed this load from your main server and protect it from handling sudden peaks in demand?

ninetyninenine 3 days ago||

There’s no red flag.

These background tasks are all database calls. That means the cpu is just waiting on the database for the majority of the call. Most modern servers can handle 10k of these calls concurrently. And you can do this off of one not so powerful CPU. Even half a cpu can handle this. Of course it depends on the CPU but you get my point.

The database is the bottleneck. The database is the thing that needs to be scaled first before you scale servers. This is the most common web application pattern. One way is providing more compute to the database (sharding is better than increasing CPU power, as the bottleneck in the database is usually filesystem access, not CPU power). Another way is to have a queue buffer the traffic spikes. Both of these address an issue with the database first.

In most web apps, all the server does is wait for a database. The database is doing the compute. You never want the server to do compute, as that becomes what we call a “blocking call.” These blocking calls are the ones you offload to an external service, as they “block” entire CPU threads. Database calls do not “block”, as the server will context switch to another green thread during database calls.
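To make the "the server is mostly waiting" argument concrete, here is a rough sketch using Python's asyncio as a stand-in for the event loop behind setTimeout. `fake_db_call` and its 50 ms delay are invented placeholders for a real query, and 10,000 is just the figure quoted above, so treat the numbers as illustrative only.

```python
import asyncio
import time


async def fake_db_call(task_id: int) -> int:
    # Hypothetical stand-in for a database round trip: the coroutine only
    # waits, so the event loop is free to run other tasks in the meantime.
    await asyncio.sleep(0.05)
    return task_id


async def run_tasks(count: int) -> list:
    # Start every call at once on a single thread and wait for all of them.
    return await asyncio.gather(*(fake_db_call(i) for i in range(count)))


def main(count: int = 10_000) -> tuple:
    started = time.perf_counter()
    results = asyncio.run(run_tasks(count))
    elapsed = time.perf_counter() - started
    return len(results), elapsed


if __name__ == "__main__":
    done, seconds = main()
    print(f"{done} simulated calls finished in {seconds:.2f}s on one thread")
```

Run back to back, the same calls would take `count * 0.05` seconds; overlapped on one thread they finish in a small fraction of that. A CPU-heavy task would not overlap this way, which is the "blocking call" case described above.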

If you work somewhere where you’re scaling crud servers but not after scaling a central database it usually means you’re in a company that doesn’t get it and overemphasizes on “architecture” over common sense. It’s actually extremely common in lower tier small companies to have not so smart people build things like this that don’t make any sense. They aren’t thinking coherently and I’ve seen tons of people who just miss this common sense notion.

I’ll be frank. It’s stupid and defies common sense. It’s likely you are doing this? But it’s also extremely commonplace.

ninetyninenine 4 days ago|||

Am I mistaken? Is what you say even possible?

Given two graphs one is a tree you cannot determine if the tree is a subgraph of the other graph in one walk through?

It’s only possible if you’re given additional information? Like a starting node to search from? I’m genuinely confused?

jcynix 4 days ago|||

Take a look at Carl Hewitt's Same-Fringe solution, which flattens structures concurrently and compares the final (aka leaf) nodes:

http://www.nsl.com/papers/samefringe.htm

If you flatten both of your trees/graphs and regard the output as strings of nodes, you reduce your task to a substring search.

Now if you want to verify that the structures and not just the leaf nodes are identical, you might be able to encode structure information into your strings.
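A small sketch of the flatten-and-compare idea, assuming trees are represented as nested Python tuples (an assumption made for the example). Generators stand in for the concurrent processes of the original formulation, and the names `fringe`, `same_fringe` and `fringe_contains` are made up here.

```python
from itertools import zip_longest
from typing import Iterator

_MISSING = object()  # sentinel for "one fringe ran out before the other"


def fringe(tree) -> Iterator[object]:
    """Lazily yield the leaves of a nested-tuple tree, left to right."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from fringe(child)
    else:
        yield tree


def same_fringe(a, b) -> bool:
    """Compare two trees leaf by leaf, stopping at the first difference."""
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=_MISSING)
    return all(x == y for x, y in pairs)


def fringe_contains(big, small) -> bool:
    """Treat both fringes as strings of nodes and do a plain substring search."""
    hay, needle = list(fringe(big)), list(fringe(small))
    return any(
        hay[i:i + len(needle)] == needle
        for i in range(len(hay) - len(needle) + 1)
    )
```

For example, `same_fringe((1, (2, 3), ((4,), 5)), ((1, 2), (3, (4, 5))))` is true even though the two trees have different shapes, which is exactly why structure has to be encoded separately if it matters.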

ninetyninenine 4 days ago||

Thanks. Good solution.

I was thinking in terms of finding all subgraph isomorphisms. But this definitely is O(N) if all you need is one solution.

But then I thought about it even further and this reduces to a sliding window problem. In this case you still need to travel to each node in the window to see if there’s a match.

So it cannot be that you traverse each node once. Not if you want to find all possible subgraph isomorphisms.

Imagine a string that is a fractal of substrings:

     rrrrrrrrrrrrrrrrrrrrrrrrrrrr

And the other one:

     rrrrrrr

Right? The sliding window for rrrrrrr will be 7 in length and you need to traverse that entire window every time you move it. So by that fact alone every node is traversed at least 7 times.
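Counting the comparisons makes the repeated work visible. The sketch below is a naive sliding-window matcher that reports every match plus the number of character comparisons it made; the function name is made up, and the run of 28 r's used in the note is a length picked for the example.

```python
def naive_matches(text: str, pattern: str) -> tuple:
    """Return (start offsets of every match, number of character comparisons)."""
    matches = []
    comparisons = 0
    for start in range(len(text) - len(pattern) + 1):
        matched = True
        for offset, expected in enumerate(pattern):
            comparisons += 1
            if text[start + offset] != expected:
                matched = False
                break
        if matched:
            matches.append(start)
    return matches, comparisons
```

With `naive_matches("r" * 28, "r" * 7)` there are 22 window positions, each needing all 7 comparisons, so 154 comparisons for a 28-character text. Characters in the middle of the text are looked at 7 times, once per window that covers them.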

bironran 4 days ago|||

I oversimplified. See https://news.ycombinator.com/item?id=44390701

ccppurcell 4 days ago|||

Hi I'm a mathematician with a background in graph theory and algorithms. I'm trying to find a job outside academia. Can you elaborate on the kind of work you were doing? Sounds like I could fruitfully apply my skills to something like that. Cheers!

hershey890 4 days ago|||

Look into quantitative analyst roles at finance firms if you’re that smart.

There’s also a role called algorithms engineer at standard tech companies (typically for lower level work like networking, embedded systems, or graphics), but the lack of an engineering background may hamstring you there. Engineers working in crypto also use a fair bit of algorithms knowledge.

I do low level work at a top company, and you only use algorithms knowledge on the job a couple of times a year at best.

fuzztester 4 days ago||||

You can try to get a job at an investment bank, if you're okay with heavy slogging, i.e., in terms of hours, which I have heard is the case, although that could be wrong.

I heard from someone who was in that field, that the main qualification for such a job is analytical ability and mathematics knowledge, apart from programming skills, of course.

bironran 4 days ago|||

That was about 20 years ago. Not much translates to today's world. I was in the algorithms team working on a CMDB product. Great tech, terrible marketing.

These days it's very different, mostly large-ish distributed systems.

ccppurcell 3 days ago||

Thanks for replying anyway!

chamomeal 4 days ago|||

I would love a little more context on this, cause it sounds super interesting and I also have zero clue what you’re talking about. But translating a stateful program into a stateless one sounds like absolute magic that I would love to know about

ninetyninenine 4 days ago||

He has two graphs. He wants to determine if one graph is a subset of another graph.

The graph that is to be determined as a subset is a tree. From there he says it can be done in an algorithm that only traverses every node at most one time.

I’m assuming he’s also given a starting node in the original graph and the algorithm just traverses both graphs at the same time starting from the given start node in the original graph and the root in the tree to see if they match? Standard DFS or BFS works here.

I may be mistaken. Because I don’t see any other way to do it in one walk through unless you are given a starting node in the original graph but I could be mistaken.

To your other point, the algorithm inherently has to be stateful as well. All traversal algorithms for graphs have to keep long-term state, simply because if you're at a node in a graph and it has, say, 40 paths to other places, you can only go down one path at a time and you have to statefully remember that the node has another 39 paths that you have to come back to later.
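The "remember the other 39 paths" state is just the traversal's stack. A minimal iterative depth-first walk, assuming the graph is an adjacency dict (a representation chosen for the example), shows where that state lives:

```python
def dfs_order(graph: dict, start: str) -> list:
    """Visit every node reachable from start once, depth first."""
    visited = set()
    order = []
    stack = [start]  # nodes we still owe a visit: the "come back later" state
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so the first-listed neighbor is explored first.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                stack.append(neighbor)
    return order
```

For `{"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}` starting at `"a"`, the walk visits `a, b, d, c`. While it is busy under `b`, the pending `c` sits on the stack. That state is tied to one walk and can be thrown away afterwards, which is different from a server holding state between requests.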

bironran 4 days ago||

kindaaaa....

I oversimplified the problem :). Really it was about generating an isomorphic-ish view, based on some user-defined rules, of an existing graph, itself generated by a subgraph isomorphism by a query language.

Think of a computer network as a graph, with various other configuration items like processes, attached drives, etc. (something also known as a CMDB). Query that graph to generate a subgraph out of it. Then use rules to make that subgraph appear as a tree of layers (a tree, but in each layer you may have additional edges between the vertices), because trees are an efficient, non-complex representation in 2D space (i.e. monitors).

However, a child node in that tree isn't necessarily connected directly to the parent node. E.g. one of the rules may be "display the sub network and the attached drives in a single layer", so now the parent node, the gateway, has both network nodes (directly connected to it) and attached drives (indirectly connected to it) as direct descendants.

Extend this to be able to connect through any descendant, direct or indirect (gateway -> network node -> disk -> config file -> config value - but put the config value on the level of the network node and build a link between them to represent the compound relationship).

Walk through the original subgraph while evaluating the rules and build a "trace back" stack to let you understand how to build each layer even in the presence of compound links while performing a single walkthrough instead of n*m (original vertices * rules for generation).
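Here is one possible reading of that description as code. It is a toy and not the actual product's algorithm: the rule format (`attach_to`, mapping a node kind to the kind of ancestor it should be displayed under), the `kinds` lookup and the function name are all assumptions made for this sketch. The point is only that a peekable stack of the current path lets each vertex be placed during a single walk.

```python
def build_view(graph: dict, kinds: dict, root: str, attach_to: dict):
    """Single walk over graph that emits a display tree plus compound links.

    attach_to is a hypothetical rule table: node kind -> kind of the ancestor
    under which nodes of that kind should be displayed.
    """
    children = {root: []}   # output tree: display parent -> child nodes
    compound = {}           # node -> hops skipped between display parent and node
    stack = []              # trace-back stack: path from the root to the current node
    seen = set()

    def walk(node: str) -> None:
        seen.add(node)
        if stack:
            display_parent, via = stack[-1], ()
            wanted = attach_to.get(kinds[node])
            if wanted is not None:
                # Peek down the stack for the nearest ancestor of the wanted kind.
                for depth in range(len(stack) - 1, -1, -1):
                    if kinds[stack[depth]] == wanted:
                        display_parent = stack[depth]
                        via = tuple(stack[depth + 1:])
                        break
            children.setdefault(display_parent, []).append(node)
            if via:
                compound[node] = via  # remember the indirect relationship
        children.setdefault(node, [])
        stack.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                walk(nxt)
        stack.pop()

    walk(root)
    return children, compound
```

With a gateway connected to two hosts, one of which has a disk, and the rule `{"drive": "gateway"}`, the disk ends up as a direct child of the gateway next to the hosts, and `compound` records that it was actually reached via the host.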

As I said, that was a lot of fun. I miss those days.

ninetyninenine 4 days ago|||

The target being a tree is irrelevant right? It’s the “guided” part that makes a single walk through possible?

You are starting at a specific node in the graph and saying that if there’s an isomorphism the target tree root node must be equivalent to that specific starting node in the original graph.

You just walk through the original graph following the pattern of the target tree and if something doesn’t match it’s false otherwise true? Am I mistaken here? Again the target being a tree is a bit irrelevant. This will work for any subgraph as long as you are also given starting point nodes for both the target and the original graph?

bironran 4 days ago||

I oversimplified. See https://news.ycombinator.com/item?id=44390701

bravesoul2 5 days ago|||

Nice when you turn an entire server into a library/executable.

fuzztester 4 days ago|||

>Those were fun times. Hadn't done anything algorithmically impressive since.

the select-a-bunch-of-code-and-then-zap-it-with-the-Del-key is the best hardware algorithm.

ddejohn 5 days ago|||

Sounds interesting. Have you written about it in more detail somewhere?

bironran 4 days ago||

See https://news.ycombinator.com/item?id=44390701

bbkane 5 days ago|||

What did the software product do?

bironran 4 days ago||

The product was a CMDB, with great tech and terrible marketing.

antihero 4 days ago|||

I'm sure with the impending tide of slop-code, we'll have many more things to delete in our lifetimes.

b0a04gl 5 days ago||

[flagged]

pech0rin 4 days ago|||

I'm sick and tired of all these AI generated comments. Oh you got the AI to use lower case! Wow it still writes the exact same way.

cb5r 4 days ago|||

I advise checking out the user's other comments before jumping to conclusions. Doesn't look AI generated to me, rather just an "individual" writing style. Just because it's possible doesn't mean it's true. Maybe the user can confirm?

lukan 4 days ago||||

Hm. Not convinced. What makes you so sure?

Otherwise just downvote or flag I guess, but this comment of yours just reads as an insult to a person that maybe did not put the most effort into writing their comment, but seems genuine to me at least.

JdeBP 4 days ago|||

The now removed stuff, in the original, talking about a blue whale was somewhat odd.

lukan 4 days ago||

Ok, if there was more and weird stuff, that now got edited out(after being called out?), that would be a different story.

generalizations 4 days ago|||

sounds like the “eigenprompt”

cess11 4 days ago|||

On a medium sized system that isn't young and fresh deleting 60 KLOC is highly unlikely to reflect a "system rethink".

Is this, from elsewhere in the thread, a system rethink, https://github.com/dotnet/runtime/pull/36715/files ?

I've worked on a product that reinvented parts of the standard library in confusing and unexpected ways, meaning that a lot of the code could easily be compacted 10-50 times in many places, i.e. 20-50 lines could be turned into 1-5 or so. I argued for doing this and deleting a lot of the code base, which didn't take hold before I and every other dev left except one. Nine months after that they had deleted half the code base out of necessity, roughly 2 MLOC to 1 MLOC, because most of it wasn't actually used much by the customers and the lone developer just couldn't manage the mess on his own.

I wouldn't call that a system rethink.

jfengel 5 days ago||

In college I worked for a company whose goal was to prove that their management techniques could get a bunch of freshmen to write quality code.

They couldn't. I would go find the code that caused a bug, fix it, and discover that the bug was still there, because previous students, rather than add a parameter to a function, would make a copy and slightly modify it.

I deleted about 3/4 of their code base (thousands of lines of Turbo Pascal) that fall.

Bonus: the customer was the Department of Energy, and the program managed nuclear material inventory. Sleep tight.
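For contrast with the copy-and-modify habit, here is what "add a parameter" tends to look like. The original was Turbo Pascal; this is a Python sketch with made-up names (`total_mass`, `mass_kg`, `unit`), so take it as an illustration of the shape of the fix rather than anything from that code base.

```python
def total_mass(items: list, unit: str = "kg"):
    """Sum item masses; `unit` is the added parameter, defaulting to the old behavior."""
    kilograms = sum(item["mass_kg"] for item in items)
    if unit == "kg":
        return kilograms
    if unit == "g":
        return kilograms * 1000
    raise ValueError(f"unsupported unit: {unit}")
```

Existing callers keep calling `total_mass(items)` unchanged, new callers pass `unit="g"`, and a bug in the summing logic only has to be fixed in one place instead of in every near-identical copy.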

uticus 5 days ago||

> make a copy and slightly modify it

In addition to not breaking existing code, also has added benefit of boosting personal contribution metrics in eyes of management. Oh and it's really easy to revert things - all I have to do is find the latest copy and delete it. It'll work great, promise.

0cf8612b2e1e 5 days ago|||

I mean…when you have a pile of spaghetti, there is only so much you can do.

travisgriggs 5 days ago|||

Ask for more staff, reorganize the team into a set of multiple teams, and hire more middle management! Win win for the manager.

8n4vidtmkvmk 5 days ago||

Add tests to the function as it exists today. Submit. Add new functionality, make sure tests still pass. Done. Updating a function here and there shouldn't require more staff.
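A compact sketch of that workflow, using an invented `slugify` helper: the first test pins down what the function already does, the optional `max_length` parameter is the new functionality, and the old test has to keep passing after the change.

```python
import re
from typing import Optional


def slugify(title: str, max_length: Optional[int] = None) -> str:
    """Lowercase the title and join its alphanumeric runs with dashes."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if max_length is not None:  # new behavior, off by default
        slug = slug[:max_length].rstrip("-")
    return slug


def test_existing_behavior() -> None:
    # Written against the function as it exists today, before any change.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  -2000 Lines of Code ") == "2000-lines-of-code"


def test_new_max_length() -> None:
    # Added together with the new parameter.
    assert slugify("Hello, World!", max_length=7) == "hello-w"
    assert slugify("Hello, World!", max_length=6) == "hello"
```

Defaulting the new parameter to the old behavior is what lets the first test double as a regression check.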

SkyBelow 4 days ago||

This implies adding tests that accurately capture all the nuances of the function and don't just test the simplest logic needed to hit code coverage. When we are talking about someone new to the function, this is about the same as asking them to learn the function so they can be sure they didn't make an error when they changed it. The benefit of tests is that they are written by the person who originally created the function and is most aware of its hidden dangers.

I'm distrustful of unit testing, as I've seen too many tests written to make code coverage numbers that don't actually test the functions they are aimed at. A non-trivial number run the function asynchronously and then report a successful run before the function even finishes executing, meaning that even thrown errors don't fail the tests (granted, part of that is on the testing framework for letting unexpected errors ever result in a pass).
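The "passes before the function has run" failure mode is easy to reproduce. Below is a deliberately broken check next to a correct one, in Python's asyncio; `save_record` and both test functions are invented for the example, and the `close()` call is only there to keep the demo from emitting a "never awaited" warning.

```python
import asyncio


async def save_record(record: dict) -> bool:
    await asyncio.sleep(0.01)
    if "id" not in record:
        raise ValueError("record has no id")
    return True


def test_without_await() -> str:
    # Bug: this only builds the coroutine object. None of save_record's
    # body runs, so the ValueError can never fail the test.
    pending = save_record({})
    pending.close()  # demo only: suppress the "never awaited" warning
    return "pass"


def test_with_await() -> str:
    # Correct: actually run the coroutine to completion and observe the error.
    try:
        asyncio.run(save_record({}))
    except ValueError:
        return "fail"
    return "pass"
```

Both tests exercise the same invalid input, but only the second one reports the failure. Coverage tools can still count lines touched by tests like the first one as covered in other setups, which is how this kind of test survives review.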

hdjrudni 4 days ago|||

We have a saying at my work. "If you like it, then you should have put a test on it". If the original author didn't add adequate coverage and you end up breaking them, it's on them.

dml2135 4 days ago|||

Of course, this is the way you need to write tests -- to test the actual logical pathways and requirements of the code, and not just finagle them together to overfit some code coverage metric.

sumtechguy 4 days ago||||

Add some meat sauce and more spaghetti :)

mrweasel 4 days ago|||

Spaghetti piles are where you can do the most... if you're brave enough and have agency to do so.

nico 5 days ago|||

Immutable functions! I guess that’s one way of doing functional programming /s

kevincox 3 days ago|||

Reminds me of https://www.unison-lang.org/docs/the-big-idea/

Sharlin 4 days ago||||

In a (very real) sense, git is an immutable data structure of immutable snapshots of code.

ThunderSizzle 4 days ago||

You can do commit squashing in git, right? I know HG history editing was much more of a pain than it seems to be in git.

Sharlin 3 days ago||

Yes, but it just creates a new immutable branch in the commit graph. All the old commits are still there, but if they're not reachable from the root refs, they'll get GC'd eventually. The only mutable parts are HEAD, branch/tag names etc that can be changed to point to whatever. Anything that has a hash is necessarily immutable, because changing it in any way (including changing its parent pointer(s)) changes the hash.
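The "anything that has a hash is immutable" point can be shown with a toy model. The serialization below is invented for the example and is not git's real object format; the only property it shares with git is that a commit's id is a hash over its content, including the parent's id.

```python
import hashlib
from typing import Optional


def commit_id(message: str, tree: str, parent: Optional[str]) -> str:
    """Toy content address: hash the snapshot name, parent id and message together."""
    payload = f"tree {tree}\nparent {parent or ''}\n\n{message}".encode()
    return hashlib.sha1(payload).hexdigest()


# Original history: first <- second
first = commit_id("first", "snapshot-1", None)
second = commit_id("second", "snapshot-2", first)

# "Editing" the first commit really creates a different object...
first_reworded = commit_id("first (reworded)", "snapshot-1", None)
# ...and every descendant must be recreated to point at it.
second_rebased = commit_id("second", "snapshot-2", first_reworded)
```

Here `second` and `second_rebased` have the same message and snapshot but different ids, because the parent pointer is part of what gets hashed. The old `first` and `second` are untouched; they just stop being reachable once the branch name moves.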

windward 4 days ago|||

pfft, that's just symbol versioning

al_borland 5 days ago|||

I work with someone who has a habit of code duplication like this. Typically it’s an effort to turn around something quickly for someone who is demanding and loud. Refactoring the shared function to support the end edge case would take more time and testing, so he doesn’t do it. This is a symptom of the core problem.

8n4vidtmkvmk 5 days ago|||

I've been getting stricter about not letting that stuff into the codebase. They always say they'll clean it up later but they never do.

Sharlin 4 days ago|||

To paraphrase a Python saying, “master is where bad code goes to die”.

motorest 3 days ago|||

> They always say they'll clean it up later but they never do.

Are you sure there's anything needing cleaning up?

hdjrudni 3 days ago||

The duplicated code that needs updating in 50 places every time a bug or new feature comes in? Yes, I'm sure.

motorest 2 days ago||

> The duplicated code that needs updating in 50 places every time a bug or new feature comes in? Yes, I'm sure.

If you're talking about duplicate code showing up in 50 places then your problem is not code duplication but incompetent developers not being able to maintain a project.

If instead you're talking about code with a passing resemblance showing up in 2 or 3 places then odds are you're actually looking at more maintainable code straight in the eye and you're not able to understand how that makes the project more maintainable.

akdor1154 4 days ago||||

I have a habit of doing this for data processing code (python, polars).

For other code it's an absolute stink and i agree. But for data transforms... I've seen the alternative, a neatly abstracted in-house library of abstracted combinations of dataframe operations with different parameters and.. It's the most aesthetically pleasing unfathomable hell I've ever experienced.

So now, when munging dataframes, i will be much faster to reach for 'copy that function and modify it slightly' - maintenance headache, but at least the result is readable.

Cthulhu_ 4 days ago||||

But it's a false premise; the claim is that just copy/pasting something is faster, but is it really?

The demanding / loud person can and should be ignored; as a developer, you are responsible for code quality and maintainability, not your / their manager.

al_borland 4 days ago||

I agree. I always take the time to clean things up along the way, but short-term thinking is often incentivized and rewarded.

motorest 3 days ago|||

> I work with someone who has a habit of code duplication like this.

Are you sure it's code duplication?

I mean, read your own description: the new function does not need to support edge cases. Having to handle edge cases is a huge code smell, and a clear sign of premature generalization.

And you even admit the guy was more productive and added less bugs?

There is a reason why the mistakes caused by naive approaches to Don't Repeat Yourself (DRY) are corrected with Write Everything Twice (WET).

al_borland 3 days ago||

I didn’t say less bugs. There are a lot of bugs, they are just localized to each call, and then copy/pasted all over the place. So when found, they need to be fixed in a bunch of places. It makes for quite the mess.

They just aren’t making changes to the shared function, so they don’t need to test existing functionality still works, just their single use case.

anticodon 4 days ago|||

This reminds me of my experience. I've worked for one company based in SEA that had almost identical portals in several countries in the region. Portals were developed by an Australian company and I was hired to maintain existing/develop new portals.

Source code for each portal was stored in a separate Git repository. I asked the original authors how I was supposed to fix bugs that affect all the portals or develop new functionality for all the portals. The answer was to backport all fixes manually to all copies of the source code.

Then I asked: isn't it possible to use a single source repository and use feature flags to customize the appearance and features of each portal? The original authors said that it was impossible.

In 2-3 months I merged the code of 4-5 portals into one repository, added feature flags, and upgraded the framework version. The release went flawlessly, and it became possible to fix a bug simultaneously for all the portals or develop new functionality available across all the countries where the company operated. It was a huge relief for me, as copying bugfixes manually was a tedious and error-prone process.
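In outline, the single-repository approach replaces per-country copies of the code with per-country configuration. The sketch below is hypothetical throughout: the country codes, flag names and `render_checkout` function are invented to show the pattern, not taken from the actual portals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortalConfig:
    currency: str
    enable_installments: bool = False  # feature flag
    theme: str = "default"             # appearance flag


# One entry per country instead of one repository per country (made-up values).
PORTALS = {
    "sg": PortalConfig(currency="SGD"),
    "id": PortalConfig(currency="IDR", enable_installments=True, theme="green"),
}


def render_checkout(country_code: str, amount: float) -> list:
    """Shared checkout logic; only flag-guarded lines differ per portal."""
    config = PORTALS[country_code]
    lines = [f"Total: {amount:.2f} {config.currency}"]
    if config.enable_installments:
        lines.append("Pay in installments")
    return lines
```

A bug in the shared part of `render_checkout` is now fixed once for every portal, and launching a feature in another country is a one-line config change rather than a backport.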

free_bip 5 days ago|||

I once had to deal with some contractors that habitually did this, when confronted on how this could lead to confusion they said "that's what Ctrl+F is for."

ctrl4 4 days ago||

Oh boy! This reminded me of one of my worst tech leads. He pushed secret tokens to GitHub. When I asked in the team meeting why we would do this instead of using a secrets manager, the response was: "These are private repos. Also we signed an NDA before joining the company"

supportengineer 5 days ago|||

Was this in Blacksburg by any chance?

jfengel 5 days ago||

It was indeed! Back in the late 80s. You know of it?

It was so long ago it feels half mythical to me.

nyarlathotep_ 4 days ago|||

> Bonus: the customer was the Department of Energy, and the program managed nuclear material inventory. Sleep tight.

These are my favorite (in a sense) programmer stories--that there's these incomprehensible piles of rubbish that somehow, like, run The World and things, and yet somehow things manage to work (in an outwardly observable sense).

Although, I recall two somewhat recent stories where this wasn't the case: the unemployment benefits fiascos during the early Covid era, and some more recent air traffic control-related things (one of which affected me personally).

barbaracomell 5 days ago||

[dead]

dang 5 days ago||

Related. Others?

Negative 2000 Lines of Code (1982) - https://news.ycombinator.com/item?id=33483165 - Nov 2022 (167 comments)

-2000 Lines of Code - https://news.ycombinator.com/item?id=26387179 - March 2021 (256 comments)

-2000 Lines of Code - https://news.ycombinator.com/item?id=10734815 - Dec 2015 (131 comments)

-2000 lines of code - https://news.ycombinator.com/item?id=7516671 - April 2014 (139 comments)

-2000 Lines Of Code - https://news.ycombinator.com/item?id=4040082 - May 2012 (34 comments)

-2000 lines of code - https://news.ycombinator.com/item?id=1545452 - July 2010 (50 comments)

-2000 Lines Of Code - https://news.ycombinator.com/item?id=1114223 - Feb 2010 (39 comments)

-2000 Lines Of Code (metrics == bad) (1982) - https://news.ycombinator.com/item?id=1069066 - Jan 2010 (2 comments)

Note for anyone wondering: reposts are ok after a year or so (https://news.ycombinator.com/newsfaq.html). In addition to it being fun to revisit perennials sometimes (though not too often), this is also a way for newer cohorts to encounter the classics for the first time, an important function of this site!

j4pe 5 days ago||

I am a simple man I see -2k lines of code, I upvote

I've told this story to every client who tried schemes to benchmark productivity by some single-axis metric. The fact that it was Atkinson demonstrates that real productivity is only benchmarkable by utility, and if you can get a truly accurate quantification for that then you're on the shortlist for a Nobel in economics.

cb321 4 days ago||

Important enough to re-state whenever it arises - once you have 2 or more axes/dimensions, you no longer have a linear ordering. You need to map back to a number line to "compare". This is the motivation or driving force toward your "single axis". { That doesn't mean it's a goal any easier to realize, though. I am attempting to merely clarify/amplify rather than dispute here.. }
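A tiny sketch of that point, with two invented metrics per developer (features shipped, lines removed): in two dimensions some pairs simply cannot be ranked, and any ranking you do get depends on the weights used to collapse them onto one axis. The function names and numbers are made up for illustration.

```python
def dominates(a: tuple, b: tuple) -> bool:
    """True if a is at least as good as b on every axis and better on one."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def compare(a: tuple, b: tuple) -> str:
    if a == b:
        return "equal"
    if dominates(a, b):
        return "a"
    if dominates(b, a):
        return "b"
    return "incomparable"  # no linear ordering exists for this pair


def score(metrics: tuple, weights: tuple) -> float:
    """Map back to the number line with a weighted sum (the weights are a choice)."""
    return sum(m * w for m, w in zip(metrics, weights))


alice = (3, 2000)  # (features shipped, lines removed), hypothetical numbers
bob = (5, 0)
```

`compare(alice, bob)` returns `"incomparable"`. Weighting only features ranks bob first and weighting only removed lines ranks alice first, so the "single axis" answer is really a statement about the weights.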

LeifCarrotson 4 days ago|||

This story is particularly relevant now, as Bill passed away 3 weeks ago. There was a post about this on the front page at the time:

Bill Atkinson has died - https://news.ycombinator.com/item?id=44210606 - June 7, 2025 (277 comments)

I didn't see that post, but I'm glad we're able to remember Bill through humorous anecdotes and eternally relevant lessons like this.

Scuds 4 days ago||

I figured that articles like folklore are like an amusing movie file (say someone chopping a skin of a watermelon) that's repeatedly being passed around reddit.

abraxas 5 days ago||

An old Dilbert cartoon had the pointy-haired boss declare monetary rewards for every fixed bug in their product. Wally went back to his desk murmuring "today I'm going to code me a minivan!"

LordDragonfang 5 days ago||

https://i.imgur.com/tyXXh1d.png

My manager has it pinned on the breakroom wall.

mojo74 4 days ago|||

The Perverse incentive: https://en.wikipedia.org/wiki/Perverse_incentive

boboddy 4 days ago|||

Now I'm wondering if this story[0] I read long ago is just a written form of the comic, or if any company actually tried this.

[0]: https://thedailywtf.com/articles/The-Defect-Black-Market

synecdoche 4 days ago|||

Goodhart's law - When a measure becomes a target, it ceases to be a good measure

bravesoul2 5 days ago||

Sorry what's the minivan reference?

amoshebb 5 days ago|||

it's just a stand-in for "expensive but relatable purchase". He's saying "I'm about to write so many bugs that the sum reward will be in the tens of thousands"

windward 4 days ago||

It's also foolish. Any true PHB has caps on bonuses linked to output that would be too low to enable a minivan.

angus-g 5 days ago||||

I assume a cycle of write bug -> fix bug -> get paid until they can afford a new car!

bombcar 4 days ago|||

It would have been a sports car but Wally’s not the type.

pjdesno 4 days ago||

The strip came out in 1995, at the peak of the minivan boom, with around 1.3M units sold that year.

dml2135 4 days ago||

I've become something of the guy that's the main code remover at my current job. Part of it is because I've been here the longest on the team, so I've got both the knowledge and the confidence to say a feature is dead and we can get rid of it. But also part of it is just being the one to go in and clean up things like release flags after they've gone live in prod.

I'm trying to socialize my team to get more in the habit of this, but it's been hard. It's not so much that I get pushback, it's just that tasks like "clean up the feature flag" get thrown into the tech debt pile. From my perspective, that's feature work, it just happens to take place after the feature goes live instead of before. But it's work that we committed to when we decided to build the feature, so no, you don't get to put it on the tech debt board like it was some unexpected issue that came up during development.

Curious to hear other perspectives here, I do worry that I'm a bit too dogmatic about this sometimes. Part of it maybe comes from working in shared art / maker spaces a lot in the past, where "clean up your shit" was rule #1, and I kind of see developers leaving unused code throughout the codebase for features they owned through the same lens.
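For readers who have not done this kind of cleanup, here is a minimal sketch of what removing a release flag involves. The flag store, the flag name, the function names, and the 8% tax figure are all hypothetical and are not taken from the commenter's codebase.

```python
# Hypothetical flag store; every name below is made up for illustration.
FLAGS = {"new_checkout": True}


def checkout_total_with_flag(prices: list[float]) -> float:
    """During rollout: both code paths have to be kept working."""
    if FLAGS.get("new_checkout", False):
        return round(sum(prices) * 1.08, 2)  # new path, assumed 8% tax
    return round(sum(prices), 2)  # old path, dead once the flag is fully on


def checkout_total(prices: list[float]) -> float:
    """After cleanup: the flag lookup and the old branch are both deleted."""
    return round(sum(prices) * 1.08, 2)
```

The cleanup is small, but until it happens every reader of the function has to work out which branch is still live.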

schindlabua 4 days ago||

I probably spend 30% of my time on refactoring. Deduplicating common things different people have done, adding separating layers between old shitty code and the fancy new abstractions, adding friction to some areas to discourage crossing module boundaries, that sort of thing.

For some reason new devs keep telling me how easy it is to implement features.

Really wonder why that is. The managers keep telling me that refactoring is a nice-to-have thing and not necessary and maybe we have time next sprint.

You just have to do it without telling anyone, it improves velocity for everyone. It's architecture work on the small scale.

maxwellg 4 days ago|||

On days I write code, I try to do one "cleanup" PR a day just to get myself warmed up. Sometimes it is removing a feature flag, sometimes it is rewriting a file to use some new standards like a better logger library or test pattern. None of this is ticketed work, and if something takes longer than ten minutes or so I drop it and work on whatever I was going to work on originally. Make (trivial) cleanups a fun treat and a break from real work and it is easier to get other people excited about them.

Of course, lately anything trivial I ask codex to do - but there is still fun in figuring out what trivial thing I should have it take on next.

wijwp 4 days ago|||

Cleanup doesn't get me a raise or promoted. In a world with constant threats of layoffs, cleanup may even be penalized depending on what's rewarded. "Clean up your shit" doesn't work when my job is on the line.

It needs to be rewarded properly to be prioritized.

MetaWhirledPeas 4 days ago|||

> I do worry that I'm a bit too dogmatic about this sometimes

I haven't seen a lot of other good suggestions for how to accomplish this, so maybe you're being just the right amount of dogmatic.

slippy 4 days ago|||

Cleaning up of feature flags was something that I excelled at failing to do. If you are the one cleaning them up, then you sir deserve a raise. Don't question it. It's a service.

akdor1154 4 days ago||

> the tech debt board

Taking you to literally mean you have a separate board for tech debt, that's your problem right there.

dml2135 4 days ago||

Well, we prioritize amongst the tech debt on that board and then move it onto the main board for sprint, it's not like it's a completely separate process. Things do go there to die sometimes though.

runfaster2000 5 days ago||

This is a good example[1] at 64k LOC removal. We removed built-in support for C# + WinRT interop on Windows and instead required users to use a source-generation tool (which is still the case today). This was a breaking change. We realized we had one chance to do this and took it.

[1] https://github.com/dotnet/runtime/pull/36715/files

conartist6 5 days ago||

I think of this story every time I see a statistic about how much LLMs have "increased the productivity" of a developer

tuveson 5 days ago||

Don't be too hard on AI, it can delete code too!

https://forum.cursor.com/t/cursor-yolo-deleted-everything-in...

tonyedgecombe 4 days ago||

I love the way the "Community Ambassador" steps in and offers solutions to this problem after it has happened.

Chris_Newton 5 days ago|||

Or the current industry favourite, “X% of our new code is now written by AI!”

Cthulhu_ 4 days ago|||

Microsoft, the number being 30%; whether that's accurate is another matter. Twenty years ago people already used IDEs to generate boilerplate code (remember Java's getters/setters/hashCode/toString?) because some guy in a book said you had to.

thomashop 4 days ago||||

I use AI to simplify code. My manifesto has always been code is debt. Works really well too.

nyarlathotep_ 3 days ago||||

In Google's case, outside of LLMs I've always wondered how much code was generated by the protocol buffer compiler.

cb321 4 days ago|||

LOL. There was a time when people were excoriated for committing generated object code into version control..

1970-01-01 5 days ago||

Including the cost to build and maintain new nuclear power plants takes developers' efficiency into absurdity.

impostervt 4 days ago||

About 1.5 years ago I inherited a project with ~ 250,000 lines of code - in just the web UI (not counting back end).

The developer who wrote it was a smart guy, but he had never worked on any other JS project. All state was stored in the DOM in custom attributes, .addEventListeners EVERYWHERE... I joke that it was as if you took a monk, gave him a book about javascript, and then locked him in a cell for 10 years.

I started refactoring pieces into web components, and after about 6 months had removed 50k lines of code. Now knowing enough about the app, I started a complete rewrite. The rewrite is about 80% feature parity, and is around 17k lines of code (not counting libraries like Vue/pinia/etc).

So, soon, I shall have removed over 200,000 loc from the project. I feel like then I should retire as I will never top that.

motorest 4 days ago||

> The rewrite is about 80% feature parity, and is around 17k lines of code (not counting libraries like Vue/pinia/etc).

This is exactly where these comparisons break down. Obviously you don't need as much code to get passable implementations of a fraction of all the features.

Philip-J-Fry 4 days ago|||

It's definitely a good argument for not reinventing the wheel though.

I'd rather have 250,000 lines of code where 230,000 of them are in battle-tested libraries, and only 20,000 lines are what we ever need to read or write.

vendiddy 4 days ago||

I will frequently extract OSS style libraries out of our app and put them in a packages/ folder.

sshine 4 days ago||||

>> is about 80% feature parity, and is around 17k lines of code

You make a fair point that a basic framework can be expressed with much less code.

And that the remaining 20% probably contains more edge cases with proportionally more code.

But do you think the last 20% will eventually make up anywhere near 233k lines of code?

The real save here comes from rewriting: seeing all the common denominators and knowing what's ahead.

williamdclt 4 days ago|||

I mean, you can get basic implementations of Vue and state management libs in a few hundred (maybe thousand?) LOCs (lots of examples on the interweb) that are probably less "toyish" than whatever this person had handrolled

Cthulhu_ 4 days ago||

> I joke that it was as if you took a monk, gave him a book about javascript, and then locked him in a cell for 10 years.

I've had a similar experience (see other comment). The original author coded like a junior developer at best, but was unfortunately a middle-aged, experienced developer, one of the founders of the company, and very productive. But obviously not someone who had ever worked in a team or had someone else work on their codebase.

Think functions thousands of lines long, nested switch/case/if/else/ternary things ten levels deep, concatenated SQL queries (it was PHP because of course), concatenated JS/HTML/HTML-with-JS (it was Dojo front-end), no automated tests of any sort, etc.

vodou 5 days ago||

A long time ago I was working in a big project where the PLs came up with the most horrible metric I've ever seen. They made a big handwritten list, visible for the whole team, where they marked for each individual developer how many bugs they had fixed and how many bugs they had caused.

I couldn't believe my eyes. I was working in my own project beside this team with the list, so thankfully I was left out of the whole disaster.

A guy I knew wasn't that lucky. I saw how he suffered from this harmful list. Then I told him a story I had recently heard about the Danish film director Lars von Trier. von Trier was going to be chosen to appear in a "canon" list of important Danish artists that the government was responsible for. He then made a short film where he took the Danish flag (red with a white cross), cut out the white lines and stitched it together again, forming a red communist flag. von Trier was immediately made persona non grata and removed from the "canon".

Later that day my friend approached the bugs caused/fixed list, cut out his own line, taped it together and put it on the wall again. I'll never forget how a PL came into the room later, stood and gazed at the list for a long time before he realized what had happened. "Did you do this?" he asked my friend. "Yes", he answered. "Why?", said the PL. "I don't want to be part of that list", he answered. The next day the list was gone.

A dear memory of successful subversion.

fathomdeez 5 days ago||

I'm having a lot of trouble visualizing both the flag and the list modifications.

voidUpdate 4 days ago|||

The Danish flag is a white cross on a red background. If you cut out the white cross, you will be left with four rectangles of red, which can be pushed together and sewn up again, forming a solid red flag.

fathomdeez 4 days ago||

I see. I was wondering where the hammer and sickle came in, since without those is it really a communist flag?

_dain_ 4 days ago||

They were red as well, so you couldn't see them

bombcar 4 days ago|||

Took me two reads but he cut his line out of the list, taped it back together and replaced the list on the wall, without his line.

Cthulhu_ 4 days ago||

> "I don't want to be part of that list"

Simple, to the point, love it. "I'm not playing your stupid management games".

justtinker 4 days ago|

In the days when Perl was the language of choice for the web I got a 97% reduction in code size. I was asked to join a late project to speed it up. (Yes, I know that has a low success rate.)

The lead dev was a hardcore C programmer and had no Perl experience before this job. He handed me a 200-line uncommented function that he had written and that was not working. It was a pattern matcher. I replaced it with 6 lines of commented Perl with a regex that was very readable (for a regex).

Since he had no idiomatic understanding of Perl he did not accept it and complained to management. We had to bring in the local Perl demigod to arbitrate (at 21 he was half my age at the time, but smart as a whip). He ruled in my favor and the lead was pissed.
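The comment does not say what the pattern was, so the following is only a sketch of the general idea, written in Python rather than Perl, with a made-up `key=value` format standing in for the real one. It shows how a commented regular expression can stay readable while replacing a hand-written character-by-character matcher.

```python
import re

# Hypothetical pattern: "key=value" pairs separated by whitespace.
PAIR = re.compile(
    r"""
    (?P<key>[A-Za-z_]\w*)   # identifier-style key
    =                       # literal separator
    (?P<value>[^\s=]+)      # value: a run of non-space, non-equals characters
    """,
    re.VERBOSE,
)


def parse_pairs(line: str) -> dict[str, str]:
    """Return every key=value pair found in line as a dict."""
    return {m["key"]: m["value"] for m in PAIR.finditer(line)}


pairs = parse_pairs("user=alice status=200 path=/index.html")
```

The `re.VERBOSE` flag is what allows the inline comments, which plays roughly the same role as the `/x` modifier in Perl.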

ryao 4 days ago|

Was he unaware of regex.h?

https://www.man7.org/linux/man-pages/man3/regcomp.3p.html

sgerenser 4 days ago||

Doing regex in C back in the day was not very common and far from idiomatic, unlike Perl, where it's basically expected that you cram regexes in anywhere you can.
