The new calculus of AI-based coding

Posted by todsacerdoti 2 days ago

The new calculus of AI-based coding (blog.joemag.dev)

193 points | 209 comments | page 2

Gud 2 days ago|

People bemoan unrecognisable code bases, maybe that is true.

I’ve sure used various LLMs to crack difficult nuts: problems I have been able to verbalise, but unable to solve.

Chances are that if you are using an LLM to mass-produce boilerplate, you are writing too much boilerplate.

gachaprize 2 days ago||

Classic LLM article:

1) Abstract data showing an increase in "productivity" ... CHECK

2) Completely lacking in any information on what was built with that "productivity" ... CHECK

Hilarious to read this on the backend of the most widely publicized AWS failure.

tomhow 2 days ago||

Please don't post shallow, snarky dismissals on HN. The guidelines ask us to be more thoughtful in the way we respond to things:

https://news.ycombinator.com/newsguidelines.html

fransje26 2 days ago|||

Ironic that you refer to the guidelines.

The first paragraph of said guidelines reads:

  What to Submit

  On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.

And yet, the original submission was just another version of the trope "I used AI to boost my productivity 10 fold, and it was all roses and butterflies."

After the n-th iteration of the same self-congratulating, hype-pushing, AI-generated drivel, the point can be made that the original submission does not meet the HN guidelines.

To quote Jimmy in South Park: it's an ad.

tomhow 2 days ago||

If the article isn't fit for HN, people can flag it, and if they really want to, they're welcome to email us at hn@ycombinator.com to point out what's wrong with it and we'll consider further penalties.

Comments like the one I replied to just make HN seem mean and miserable and that's definitely something we're trying to avoid.

Dilettante_ 2 days ago|||

GP is making very specific points of contention about the article.

tomhow 2 days ago||

It's the first ever comment by that account, and it seems to contain fulmination and snark, both of which are explicitly against the guidelines, as is curmudgeonliness. Criticism is fine, but it needs to be substantive.

alfalfasprout 2 days ago|||

Yep. The problem is then leadership sees this and says "oh, we too can expect 10x productivity if everyone uses these tools. We'll force people to use them or else."

And guess what happens? Reality doesn't match expectations and everyone ends up miserable.

Good engineering orgs should have engineers deciding what tools are appropriate based on what they're trying to do.

gtsop 2 days ago||

[dead]

StilesCrisis 2 days ago||

The biggest thing that stood out to me was that they suddenly started working nonstop, even on weekends…? If AI is so great, why can’t they get a single day off in two months?

keeda 2 days ago||

This article is right, but I think it may underplay the changes that could be coming soon. For instance, as the top comment here about TDD points out, the actual code does not matter anymore. This is an astounding claim! And it has naturally received a lot of objections in the replies.

But I think the objections can mostly be overcome with a minor adjustment: You only need to couple TDD with a functional programming style. Functional programming lets you tightly control the context of each coding task, which makes AI models ridiculously good at generating the right code.
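As a rough illustration of what "tightly controlling the context" might look like, here is a minimal Python sketch. The names `LineItem` and `order_total` are hypothetical and come from neither the article nor this thread; the only point is that a pure function's entire context is its signature.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    """Hypothetical immutable input record, invented for this sketch."""
    unit_price_cents: int
    quantity: int


def order_total(items: tuple[LineItem, ...], discount_percent: int = 0) -> int:
    """Pure function: the result depends only on the arguments.

    No globals, no I/O, no mutation, so everything a model (or a reviewer)
    needs to know about the task is visible in the signature.
    """
    subtotal = sum(item.unit_price_cents * item.quantity for item in items)
    # Integer arithmetic keeps the result exact; the discount is rounded down.
    return subtotal - (subtotal * discount_percent) // 100
```

Because nothing outside the arguments can influence the result, a prompt for this task would not need to describe the rest of the codebase, which is presumably where the claimed benefit comes from.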

Given that, if most of your code is tightly-scoped, well-tested components implementing orthogonal functionality, the actual code within those components will not matter. Only glue code becomes important and that too could become much more amenable to extensive integration testing.

At that point, even the test code may not matter much, just the test-cases. So as a developer you would only really need to review and tweak the test cases. I call this "Test-Case-Only Development" (TCOD?)
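A minimal sketch of how that might look in Python, assuming the test cases live in a plain data table and the implementations are treated as interchangeable. Everything here (`SLUG_CASES`, `slugify_v1`, `slugify_v2`, `failing_cases`) is a made-up example, not something from the article or the linked post.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical spec: in this style, this table is the only thing a human reviews.
# Each case is (input text, expected slug).
SLUG_CASES = [
    ("Hello World", "hello-world"),
    ("  trim  me  ", "trim-me"),
    ("Already-Slugged", "already-slugged"),
    ("", ""),
]


def slugify_v1(text: str) -> str:
    """One candidate implementation (imagine it was generated)."""
    return "-".join(text.lower().split())


def slugify_v2(text: str) -> str:
    """A differently written candidate that behaves the same on the cases."""
    words = [w for w in text.strip().lower().split(" ") if w]
    return "-".join(words)


def failing_cases(impl: Callable[[str], str]) -> list[tuple[str, str, str]]:
    """Return (input, expected, actual) for every case the candidate fails."""
    return [
        (given, expected, impl(given))
        for given, expected in SLUG_CASES
        if impl(given) != expected
    ]
```

Both candidates return an empty list from `failing_cases`, so under this approach either could be swapped in without anyone reading its body. Whether a table like this covers enough behavior to justify that is exactly the open question.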

The actual code can be completely abstracted away, and your main task becomes design and architecture.

It's not obvious this could work, largely because it violates every professional instinct we have. But apparently somebody has even already tried it with some success: https://www.linkedin.com/feed/update/urn:li:activity:7196786...

All the downsides that have been mentioned will be true, but also may not matter anymore. E.g. in a large team and large codebase, this will lead to a lot of duplicate code with low cohesion. However, if that code does what it is supposed to and is well-tested, does the duplication matter? DRY was an important principle when the cost of code was high, and so you wanted to have as much leverage as possible via reuse. You also wanted to minimize code because it is a liability (bugs, tech debt, etc.) and testing, which required even more code that still didn't guarantee lack of bugs, was also very expensive.

But now that the cost of code is plummeting, that calculus is shifting too. You can churn out code and tests (including even performance tests, which are always an afterthought, if thought of at all) at unimaginable rates.

And all this while reducing the dependencies of developers on libraries and frameworks and each other. Fewer dependencies means higher velocity. The overall code "goodput" will likely vastly outweigh inefficiencies like duplication.

Unfortunately, as TFA indicates, there is a huge impedance mismatch with this and the architectures (e.g. most code is OO, not functional), frameworks, and processes we have today. Companies will have to make tough decisions about where they are and where they want to get.

I suspect AI-assisted coding taken to its logical conclusion is going to look very different from what we're used to.

erichocean 2 days ago|

I seem to get way better results from AI coding than my peers, but I'm also doing functional programming. Maybe that's why.

> I suspect AI-assisted coding taken to its logical conclusion is going to look very different from what we're used to.

100%. I now design new libraries so that AI can easily write code for them.

bcrosby95 2 days ago||

It's amazing that their metrics exactly match the mythical "10x engineer" in productivity boost.

Naklin 2 days ago|

Without disclosing said metrics or any data...

This is really something striking to me about all these AI productivity claims. They never provide the methodology and data.

quikoa 1 day ago||

They rarely provide the projects either, or even the type of project. I'd like to see all these awesome results that are built with AI (preferably not web related).

brazukadev 2 days ago||

But here's the critical part: the quality of what you are creating is way lower than you think, just like AI-written blog posts.

collingreen 2 days ago|

Upvoted for a dig that is also an accurate and insightful metaphor.

r0x0r007 2 days ago||

"For me, roughly 80% of the code I commit these days is written by the AI agent" Therefore, it is not committed by you, but by you in the name of the AI agent and the holy slop. What to say, I hope that 100x productivity is worth it and you are making tons of money. If this stuff becomes mainstream, I suggest open source developers stop doing the grind part, stop writing and maintaining cool libraries and just leave it all to the productivity guys, let's see how far they get. Maybe I've seen too many 1000x hacker news..

visarga 2 days ago||

Just need the feedback to follow suit to be 100x as effective. Tests, docs and rapid loops of guidance with human in the loop. Split your tasks, find the structure that works.

ChadNauseam 2 days ago||

I think it's fine. For example, "I" made this library https://github.com/anchpop/weblocks . It might be more accurate to say that I directed AI to make it, because I didn't write a line of code myself. (And I looked at the code and it is truly terrible.) But I tested that it works, and it does, and it solves my problem perfectly. Yes, it is slop, but this is a leaf node in the abstraction graph, and no one needs to look at it again now that it is written.

shakna 2 days ago|||

Most code, though, is not write once and ignore. So it does matter if it's crap, because every piece of software is only as good as its weakest dependency.

Fine for just you. Not fine for others, not fine for business, not fine the moment your star count starts moving.

cadamsdotcom 2 days ago||

"We have real mock versions of all our dependencies!"

Congratulations, you invented end-to-end testing.

"We have yellow flags when the build breaks!"

Congratulations! You invented backpressure.

Every team has different needs and path dependencies, so each settles on a different interpretation of CI/CD and software eng process. Productizing anything in this space is going to be an uphill battle to yank away teams' hard-earned processes.

Productizing process is hard but it's been done before! When paired with a LOT of spruiking it can really progress the field. It's how we got the first CI/CD tools (eg. https://en.wikipedia.org/wiki/CruiseControl) and testing libraries (eg. pytest)

So I wish you luck!

jcgrillo 2 days ago||

> For our team, every commit has an engineer's name attached to it, and that engineer ultimately needs to review and stand behind the code.

Then they claim (and demonstrate with a picture of a commits/day chart) a team-wide 10x throughput increase. I claim there's got to be a lot of rubber-stamp reviewing going on here. It may help to challenge the "author" to explain things like "why does this lifetime have the scope it does?" or "why did you factor it this way instead of some other way?", i.e. questions which force them to defend the "decisions" they made. I suspect that if you're actually doing thorough reviews, velocity will decrease rather than increase with LLMs.
