An AI coding agent, used to write code, needs to reduce your maintenance costs

Posted by cratermoon 11 hours ago

An AI coding agent, used to write code, needs to reduce your maintenance costs(www.jamesshore.com)

179 points | 43 comments

richardbarosky 8 hours ago|

Insightful. Agree with this take.

Unfortunately, maintainability is simply bucketed as a "non-functional" requirement.

Maintainability (and similar NFRs) should actually be considered what preserves and enables the delivery of future functional requirements -- in contrast to framing non-functional requirements as simply "how" the software must do what it does vs. the "what"/functional requirements that "actually matter".

From that standpoint, if a steady flow of features/improvements is important for a project, maintainability isn't really a non-functional requirement at all, and amounts to being a functional requirement, in practice, over anything except the shortest of time horizons.

Jenk 3 hours ago||

I've found the first, and most important, step for any team or organisation to eliminate concerns with NFRs, "tech debt", and whatever else it may be called, is to stop giving it a name.

I'm being completely serious. By giving it some kind of distinct name, you are giving license to it being ring-fenced and de-prioritised by someone who doesn't (but, arguably, probably should) know better.

Quality matters. It hits your P&L very quickly and very hard if you don't maintain it. So it is as important as any other factor.

Exoristos 1 hour ago||

Name it "not done yet." But, yes, very keen observation here.

bluefirebrand 5 hours ago||

> amounts to being a functional requirement, in practice, over anything except the shortest of time horizons

Right! The unfortunate thing is that many software companies don't seem to think much further than a quarter ahead, not really.

Sure, they might have a product roadmap that extends for a year or two into the future, but let's be honest. Often that roadmap is mostly for sales purposes, not engineering planning purposes. Product and engineering will pivot if sales slump. The earlier in the company's lifespan, the more often this is likely to happen.

However, if companies get out of this startup mode then they should start to stabilize... But many don't. They continue this pattern of short-sighted, short-term planning, which means product stability remains a low-priority effort.

Ultimately I guess many companies just either do not have the resources to build good software or do not actually care to.

keithnz 8 hours ago||

In my experience AI reduces maintenance costs, though context might matter here. I'm working on a multi-decade set of projects, and while there is a lot of greenfield feature development, the old code and older projects have suddenly become a lot easier to work with and modernize, and in a bunch of cases have been eliminated. Dependencies on old libraries and build tools have in some cases been updated and in other cases just eliminated; builds are faster and easier for developers, etc. End-to-end testing has become a lot easier to set up and automate. DevOps has improved a lot and diagnosing production issues has drastically improved: we have a ton of logs and information, and while we have various consolidated dashboards / monitoring to capture critical things, we can now do a lot more analysis on our deployed system (~50-ish projects).

theteapot 6 hours ago|

This rings true for me too, but I don't think it counts if you're just using AI to aid maintenance. The basic argument in the article is around how many hours of maintenance you have to do for each hour of "value-add" feature development. So A. you're only measuring maintenance costs, not the ratio, and B. the "old code" whp wasn't written with AI in the first place.

dirkc 3 hours ago||

Two things I'd add

1. software doesn't only have tech maintenance - there is also user support and it increases as software grows.

2. I'm not convinced maintenance costs scale linearly. And even if they scale linearly, you will eventually get to a point where maintenance takes up all your time.
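To make point 2 concrete, here is a toy model. It is not from the article: it assumes a fixed weekly capacity, and it assumes every feature hour ever shipped adds the same small amount of weekly upkeep (the linear case). The function name, parameters, and numbers are all illustrative assumptions.

```python
def feature_hours_per_week(capacity: float,
                           upkeep_per_feature_hour: float,
                           weeks: int) -> list[float]:
    """Toy model: maintenance is paid first, features get whatever is left."""
    shipped = 0.0  # cumulative feature hours delivered so far
    history = []
    for _ in range(weeks):
        # Linear assumption: upkeep grows in proportion to everything shipped.
        maintenance = min(capacity, shipped * upkeep_per_feature_hour)
        features = capacity - maintenance
        shipped += features
        history.append(features)
    return history


if __name__ == "__main__":
    # Hypothetical team: 40 hours a week, 0.01 upkeep hours per shipped hour.
    hours = feature_hours_per_week(capacity=40.0,
                                   upkeep_per_feature_hour=0.01,
                                   weeks=520)
    for week in (0, 52, 260, 519):
        print(f"week {week:3d}: {hours[week]:5.1f} feature hours")
```

Under these assumptions the hours left for features shrink by a constant factor every week and approach zero, even though each shipped hour costs the same to maintain.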

m463 9 hours ago||

Same with code reviews.

I wonder if AI could make code reviews more presentable.

For example, with human code reviews, developers quickly learn not to make visual changes such as reflowing code or comments, changing indentation (where the tools can't suppress it), moving functions around, removing lines, or other spurious changes.

And don't refactor code needlessly.

Also, reviews could be broken up into two: functional changes and cosmetic changes.

jpollock 1 hour ago||

Do any refactorings in separate reviews, and say things like "REFACTOR_ONLY:", with a rule that none of the code changes behavior.

That makes reviews a lot easier. The review starts from "nothing should be changing" and then reviewers can pattern match on that.

Otherwise, the reviewer is re-evaluating every line of code to make sure nothing has changed. That's really hard to do properly.

The version control systems I've worked with have allowed queues of changes, each one reviewed independently. As I'm developing, if I need a refactor, I go up a commit, refactor, send out for review, rebase my in progress work and continue.

I send out a continual stream of "CLEANUP:" "REFACTOR_ONLY:", and similar changes with the final change being a lot smaller than a big monster of a change.

Your reviewers will appreciate the effort.

Plays the metric game (if you're working in that type of org) without being evil too.
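A minimal sketch of how the "REFACTOR_ONLY:" convention could give reviewers a starting expectation. The `Change` type, the `review_hints` helper, and the `tests/` directory are hypothetical names introduced for illustration, and treating "touches test files" as a warning sign is an assumed heuristic, not part of the rule described above.

```python
from dataclasses import dataclass

# Prefix taken from the comment above; the rest is an illustrative assumption.
REFACTOR_PREFIX = "REFACTOR_ONLY:"


@dataclass
class Change:
    subject: str              # first line of the change description
    changed_paths: list[str]  # files touched by the change


def review_hints(change: Change, test_dir: str = "tests/") -> list[str]:
    """Return short hints telling the reviewer what to expect."""
    hints = []
    if change.subject.startswith(REFACTOR_PREFIX):
        hints.append("expect no behavior change")
        # Assumed heuristic: a pure refactor that edits tests deserves a second look.
        touched_tests = [p for p in change.changed_paths if p.startswith(test_dir)]
        if touched_tests:
            hints.append("edits tests, double-check: " + ", ".join(touched_tests))
    else:
        hints.append("behavior may change, review every line")
    return hints


if __name__ == "__main__":
    stack = [
        Change("REFACTOR_ONLY: extract price helper", ["src/pricing.py"]),
        Change("Add bulk discount", ["src/pricing.py", "tests/test_pricing.py"]),
    ]
    for change in stack:
        print(change.subject, "->", review_hints(change))
```

The point of the sketch is only the default: a prefixed change starts from "nothing should be changing", and anything else gets the full line-by-line read.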

crooked-v 7 hours ago||

https://github.com/ReviewStage/stage-cli looks like an interesting start on that subject.

whattheheckheck 5 hours ago||

And nwave

https://github.com/nWave-ai/nWave

They have /nw-buddy to point you in the right direction

Very nifty

afro88 1 hour ago||

The bet that he misses, which a lot of companies are starting to make or at least think about, is that AI will get better at coding. So the model / harness / whatever is next takes care of the maintenance burden.

That's the theory anyway.

hona_mind 4 hours ago||

The article's framing around the maintenance-to-feature ratio resonates with something I've been noticing in my own workflow.

One underappreciated aspect: the artifact surface area of an AI session grows much faster than the code surface area. For every hour of Claude Code output, you get not just code changes but screenshots, generated images, exported transcripts, spec drafts, downloaded model weights — all scattered across wherever Finder happened to drop them.

The maintenance cost argument applies here too. If you can't quickly navigate to the right artifact at the right moment, you end up re-generating things you already have, or worse, losing context between sessions. The "maintenance" of your working environment is a real tax on the ratio the article is describing.

I've been trying to address the file-side of this problem specifically, but the broader point stands: AI coding agents will only reduce net maintenance costs if the surrounding tooling (file management, context switching, artifact organization) keeps pace.

throwthrowuknow 28 minutes ago||

This could have been a good piece of writing if the author had chosen not to be so smugly overconfident in their belief and had shown real evidence to support their claim. Mentioning the front page of HN as your source is glib and immediately made me doubt the conclusions. I was interested to see what work the author put into researching this but apparently they didn’t do any work at all.

When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

frumiousirc 9 minutes ago|

> When an LLM provides you with an overconfident piece of writing with no sources to back it up, what do you do?

You draw made up lines on made up plots and call it evidence, obviously.

joshka 5 hours ago||

I feel like AI might let us model some of the things that we initially didn't scope that led to these problems (e.g. "Decided not to fix every bug, or upgrade every dependency") - being able to more easily ask a system that can dig into "how much time are we spending on stuff related to foo"

AI tooling can also be a place where we start building our view of what maintainable software practices look like so we don't make decisions that have these same tail effort profiles. That can be things like building out tooling to handle maintenance updates

I think the real thing that comes out of AI tooling is probably that the tooling needs to be trained (or steered) towards activities that enhance human attention management.

yurishimo 4 hours ago|

> AI tooling can also be a place where we start building our view of what maintainable software practices look like so we don't make decisions that have these same tail effort profiles. That can be things like building out tooling to handle maintenance updates

This has been possible already, but from my vantage point it doesn't look like anyone really did it? Sure, there already exist tons of OSS built for this case, even before AI, yet it seems to me to always come back to incentives. IMO, there is no incentive to write maintainable software (and I'm not sure there ever will be one at this pace). Businesses are only incentivized to write enough software to accomplish the task within their own defined SLAs and nothing further. But even that doesn't seem to be a blocker at this point if GitHub is used as an example.

Good software comes from people who care deeply about solving the problems in a way that they are invested in. If your employees don't care about your product, you're already starting on the wrong foot. AI isn't going to incentivize bad-to-average developers to write better software or a good developer to push back harder against their clueless manager. When they make the decision, AI might help (assuming it doesn't make a bigger mess) but it's not going to reduce technical debt in any meaningful way without a sea change of perspective from product managers around the world.

So far, I just don't see it happening in theory or in practice. I hope I'm proven wrong!

joshka 3 hours ago||

I think I have a different perspective on this because I've worked in places that do care about that sort of thing, on tools that do focus on those sorts of things. I think the long-term incentive for these tools to address tech debt as a goal comes from the AI eval benchmarks trending towards being saturated. The advantages of one tool over another will be in the longer-context things. This will naturally start to act as a forcing function for training to focus on the longer tail of software development. A good way of thinking about this: GPT 3.5 was good at dealing with lines of code and functions, 4 with functions and small apps, 5 seems adept at delivering apps and systems, and 6 will be systems and whole enterprise programs of work.

ianmarcinkowski 7 hours ago||

My low value comment. This feels directionally correct to me. The problems I've been struggling with in my dev job for the past 6 months have been 80% maintenance/legacy code interfering with new feature development.

Some of our developers are overly aggressive about using AI and I've started going down that path because I need to keep up and actually enjoy the flow of working with AI in my IDE.

I put a lot of work into keeping my area of the codebase understandable and coherent but I do not see that from the others on our team. I'm not perfect but I am extremely sensitive to code that is incoherent or un-grok-able at a glance.

Anyway, I like the novel (to me at least) framing of this article!

stevepotter 9 hours ago|

For me, if I can make a kickass testing system that people love so much that they actually build features with it and it’s not an afterthought, then maintenance becomes much easier. It’s often called test driven development but I’ve rarely seen it done in such a way that the dev ex is good enough for it to work.

But say you have that. Then you have great profiling. At that point you can measure correctness and performance. Then implementation becomes less of a focal point. And that makes it a lot easier to concede coding to AI.

NotGMan 9 hours ago|

This will probably be how things will work in future: devs will shift to specifying features which will be validated through tests.

The AI will then be the middle layer that will iterate until tests pass.

Layer 1: Specs (Humans)

Layer 2: Code (AI mostly)

Layer 3: Tests (AI + human checks).
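A minimal sketch of the three layers as a loop, under loud assumptions: the "spec" is just a list of input/output pairs, and the AI is stood in for by a hand-written list of candidate functions. `iterate_until_green`, `passes`, and `max_attempts` are hypothetical names, not any real tool's API.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int], int]
Spec = list[tuple[int, int]]  # Layer 1: (input, expected output) pairs from a human


def passes(candidate: Candidate, spec: Spec) -> bool:
    """Layer 3: run every spec example against the candidate."""
    return all(candidate(x) == expected for x, expected in spec)


def iterate_until_green(candidates: Iterable[Candidate],
                        spec: Spec,
                        max_attempts: int = 5) -> Optional[Candidate]:
    """Layer 2 stand-in: try candidates in order until one passes or the budget runs out."""
    for attempt, candidate in enumerate(candidates, start=1):
        if attempt > max_attempts:
            break
        if passes(candidate, spec):
            return candidate
    return None


if __name__ == "__main__":
    doubling_spec: Spec = [(1, 2), (2, 4), (3, 6)]
    # Hypothetical "AI output": two wrong attempts, then a right one.
    attempts = [lambda x: x + 1, lambda x: x * x, lambda x: 2 * x]
    winner = iterate_until_green(attempts, doubling_spec)
    print("found passing candidate:", winner is not None)
```

The loop only ever proves "passes the spec", which is why the human check in Layer 3 matters: with the single example `(2, 4)`, the squaring candidate would be accepted as well.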

visarga 6 hours ago|||

Yes, that is how I see it too. What I would add is intent testing: collect user messages and check them against executed work from time to time. Every ask must be implemented and tested, and every piece of code must be justified by a user message.
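One way such an intent check might look, as a sketch only. It assumes each user ask has an id, each code unit records which asks it serves, and there is a set of asks that have a test. `intent_gaps` and the example ids are hypothetical.

```python
def intent_gaps(asks: set[str],
                implemented: dict[str, set[str]],
                tested: set[str]) -> dict[str, set[str]]:
    """Compare user asks against the work that was actually done.

    asks:        ids of user requests
    implemented: code unit name -> ids of the asks it serves
    tested:      ids of asks that have at least one test
    """
    served = set().union(*implemented.values())
    return {
        "unimplemented_asks": asks - served,
        "untested_asks": asks - tested,
        # Code that serves no known ask has no user message justifying it.
        "unjustified_code": {unit for unit, ids in implemented.items()
                             if not (ids & asks)},
    }


if __name__ == "__main__":
    gaps = intent_gaps(
        asks={"A1", "A2", "A3"},
        implemented={"export_csv": {"A1"}, "dark_mode": {"A2"}, "legacy_sync": set()},
        tested={"A1"},
    )
    for name, items in gaps.items():
        print(name, sorted(items))
```

All three result sets being empty would correspond to "every ask implemented and tested, every piece of code justified".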

jplusequalt 8 hours ago|||

What a boring fucking future.

bluefirebrand 5 hours ago|||

No kidding. AI does all the interesting problem solving and humans...

Write tests. The most boring activity on the planet
