Posted by todsacerdoti 10/27/2025

The new calculus of AI-based coding (blog.joemag.dev)
205 points | 212 comments | page 2
jcgrillo 10/28/2025|
> For our team, every commit has an engineer's name attached to it, and that engineer ultimately needs to review and stand behind the code.

Then they claim (and demonstrate with a picture of a commits/day chart) a team-wide 10x throughput increase. I claim there's got to be a lot of rubber-stamp reviewing going on here. It may help to challenge the "author" to explain things like "why does this lifetime have the scope it does?" or "why did you factor it this way instead of some other way?", i.e. questions which force them to defend the "decisions" they made. I suspect that if you're doing thorough reviews, velocity will decrease rather than increase with LLMs.

zkmon 10/28/2025||
Is it exciting because work happens at 200mph or is it because you get that much business advantage against your competition? Or is it because, now it allows you to spend only one hour at work per day?

To quote Joey from Friends - "400 bucks are gone from my pocket and nobody is getting happier?"

gachaprize 10/27/2025||
Classic LLM article:

1) Abstract data showing an increase in "productivity" ... CHECK

2) Completely lacking in any information on what was built with that "productivity" ... CHECK

Hilarious to read this on the backend of the most widely publicized AWS failure.

tomhow 10/28/2025||
Please don't post shallow, snarky dismissals on HN. The guidelines ask us to be more thoughtful in the way we respond to things:

https://news.ycombinator.com/newsguidelines.html

fransje26 10/28/2025|||
Ironic that you refer to the guidelines.

The first paragraph of said guidelines reads:

  What to Submit

  On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.

And yet, the original submission was just another version of the trope "I used AI to boost my productivity 10-fold, and it was all roses and butterflies."

After the n-th iteration of the same self-congratulating, hype-pushing, AI-generated drivel, the point can be made that the original submission does not meet the HN guidelines.

To quote Jimmy in South-Park: it's an ad.

tomhow 10/28/2025||
If the article isn't fit for HN, people can flag it, and if they really want to, they're welcome to email us at hn@ycombinator.com to point out what's wrong with it and we'll consider further penalties.

Comments like the one I replied to just make HN seem mean and miserable and that's definitely something we're trying to avoid.

Dilettante_ 10/28/2025|||
GP is making very specific points of contention about the article.
tomhow 10/28/2025||
It's the first ever comment by that account, and it seems to contain fulmination and snark, both of which are explicitly against the guidelines, as is curmudgeonliness. Criticism is fine, but it needs to be substantive.

alfalfasprout 10/27/2025|||
Yep. The problem is then leadership sees this and says "oh, we too can expect 10x productivity if everyone uses these tools. We'll force people to use them or else."

And guess what happens? Reality doesn't match expectations and everyone ends up miserable.

Good engineering orgs should have engineers deciding what tools are appropriate based on what they're trying to do.

gtsop 10/27/2025||
[dead]
Gud 10/28/2025||
People bemoan unrecognisable code bases, maybe that is true.

I’ve sure used various LLMs to solve difficult nuts to crack. Problems I have been able to verbalise, but unable to solve.

Chances are that if you are using an LLM to mass-produce boilerplate, you are writing too much boilerplate.

rob_c 10/28/2025||
Trust but verify. It's not hard.

The corollary being: if you can't (through skill or effort) verify, don't trust.

If you break this pattern you deserve all the follies that befall you as a "professional".

StilesCrisis 10/28/2025||
The biggest thing that stood out to me was that they suddenly started working nonstop, even on weekends…? If AI is so great, why can’t they get a single day off in two months?
bcrosby95 10/27/2025||
It's amazing that their metrics exactly match the mythical "10x engineer" in productivity boost.
Naklin 10/28/2025|
Without disclosing said metrics or any data...

This is really something striking to me about all these AI productivity claims. They never provide the methodology and data.

quikoa 10/28/2025||
They rarely even provide the projects, or even the type of project. I'd like to see all these awesome results that are built with AI (preferably not web related).
exasperaited 10/27/2025||
Absolutely none of that article has ever even so much as brushed past the colloquial definition of "calculus".

These guys actually seem rattled now.

photochemsyn 10/27/2025||
Well, 'calculus' is the kind of marketing word that sounds more impressive than 'arithmetic' and I think 'quantum logic' has gone a bit stale, and 'AI-based' might give more hope to the anxious investor class, as 'AI-assisted' is a bit weak as it means the core developer team isn't going to be cut from the labor costs on the balance sheet, they're just going to be 'assisted' (things like AI-written unit tests that still need some checking).

"The Arithmetic of AI-Assisted Coding Looks Marginal" would be the more honest article title.

photonthug 10/27/2025|||
Yes, unfortunately a phrase that's used in an attempt to lend gravitas and/or intimidate people. It sort of vaguely indicates "a complex process you wouldn't be interested in and couldn't possibly understand". At the same time it attempts to disarm any accusation of bias in advance by hinting at purely mechanistic procedures.

Could be the other way around, but I think marketing-speak is taking cues here from legal-ese and especially the US supreme court, where it's frequently used by the justices. They love to talk about "ethical calculus" and the "calculus of stare decisis" as if they were following any rigorous process or believed in precedent if it's not convenient. New translation from original Latin: "we do what we want and do not intend to explain". Calculus, huh? Show your work and point to a real procedure or STFU

collingreen 10/27/2025|||
"Galaxy-brain pair programming with the next superintelligence"
keeda 10/28/2025||
This article is right, but I think it may underplay the changes that could be coming soon. For instance, as the top comment here about TDD points out, the actual code does not matter anymore. This is an astounding claim! And it has naturally received a lot of objections in the replies.

But I think the objections can mostly be overcome with a minor adjustment: You only need to couple TDD with a functional programming style. Functional programming lets you tightly control the context of each coding task, which makes AI models ridiculously good at generating the right code.

Given that, if most of your code is tightly-scoped, well-tested components implementing orthogonal functionality, the actual code within those components will not matter. Only glue code becomes important and that too could become much more amenable to extensive integration testing.

At that point, even the test code may not matter much, just the test-cases. So as a developer you would only really need to review and tweak the test cases. I call this "Test-Case-Only Development" (TCOD?)
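A minimal sketch of what reviewing only the test cases might look like, assuming a small pure function. The names `slugify`, `CASES`, and `check` are hypothetical and chosen purely for illustration. The body of `slugify` stands in for generated code that nobody reads, and the table of cases is the part a developer would review and tweak:

```python
from typing import Callable

# Hypothetical generated component: a pure function with no I/O or shared state.
# In the workflow described above, this body is the part nobody reviews.
def slugify(title: str) -> str:
    # Lowercase letters and digits, turn everything else into a word break.
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return "-".join(cleaned.split())

# The reviewed artifact: plain (input, expected output) pairs.
CASES: list[tuple[str, str]] = [
    ("Hello, World!", "hello-world"),
    ("  spaces   everywhere ", "spaces-everywhere"),
    ("", ""),
    ("Already-slugged", "already-slugged"),
]

def check(fn: Callable[[str], str],
          cases: list[tuple[str, str]]) -> list[tuple[str, str, str]]:
    """Return (input, expected, actual) for every case the function fails."""
    failures = []
    for arg, expected in cases:
        actual = fn(arg)
        if actual != expected:
            failures.append((arg, expected, actual))
    return failures

if __name__ == "__main__":
    failures = check(slugify, CASES)
    print("all cases pass" if not failures else failures)
```

Because the function is pure, any implementation that passes the table can be swapped in without touching the rest of the system, which is the property this approach relies on. It only works as well as the cases cover the behaviour you care about.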

The actual code can be completely abstracted away, and your main task becomes design and architecture.

It's not obvious this could work, largely because it violates every professional instinct we have. But apparently somebody has even already tried it with some success: https://www.linkedin.com/feed/update/urn:li:activity:7196786...

All the downsides that have been mentioned will be true, but also may not matter anymore. E.g. in a large team and large codebase, this will lead to a lot of duplicate code with low cohesion. However, if that code does what it is supposed to and is well-tested, does the duplication matter? DRY was an important principle when the cost of code was high, and so you wanted to have as much leverage as possible via reuse. You also wanted to minimize code because it is a liability (bugs, tech debt, etc.) and testing, which required even more code that still didn't guarantee lack of bugs, was also very expensive.

But now that the cost of code is plummeting, that calculus is shifting too. You can churn out code and tests (including even performance tests, which are always an afterthought, if thought of at all) at unimaginable rates.

And all this while reducing the dependencies of developers on libraries and frameworks and each other. Fewer dependencies means higher velocity. The overall code "goodput" will likely vastly outweigh inefficiencies like duplication.

Unfortunately, as TFA indicates, there is a huge impedance mismatch between this and the architectures (e.g. most code is OO, not functional), frameworks, and processes we have today. Companies will have to make tough decisions about where they are and where they want to get.

I suspect AI-assisted coding taken to its logical conclusion is going to look very different from what we're used to.

erichocean 10/28/2025|
I seem to get way better results from AI coding than my peers, but I'm also doing functional programming. Maybe that's why.

> I suspect AI-assisted coding taken to its logical conclusion is going to look very different from what we're used to.

100%. I now design new libraries so that AI can easily write code for them.

Madmallard 10/28/2025|
Correct TDD involves solving all the hard problems in the process. What gain does AI give you then?