Posted by CoffeeOnWrite 6 days ago

LLM code generation may lead to an erosion of trust (jaysthoughts.com)
248 points | 275 comments
mizzao 5 days ago|
The last section of this post seems to be quite predictive of a sibling post on the front page right now: https://news.ycombinator.com/item?id=44382752
wg0 5 days ago||
We have seen those 10x engineers churning out huge PRs faster than anyone can fathom and make sense of the whole damn thing.

Wondering what they would be producing with LLMs?

I_Lorem 5 days ago||
He's making a good point on trust, but, really, doesn't trust flow in both directions? Should the Sr. Engineer rubber-stamp or just take a quick glance at Bob's implementation because he's earned his chops, or should the Sr. Engineer apply the same level of review regardless of whether it's Bob, Mary, or Rando Calrissian submitting their work for review?
eikenberry 5 days ago|
The Sr. Engineer should definitely give (presumably another Sr. Eng.) Bob's code a quick review and approve it. If Mary or Rando are Sr., then they should get the same level as well. If anyone is a Jr., they should get a much more in-depth review, as it's a teaching opportunity, whereas Sr.-on-Sr. reviews are done to enforce conventions and to be sure the PR has an audience (people take more care when they know other people will look at it).
fhd2 5 days ago||
A bit tangential, but I noticed quite a discrepancy between augmented coding done well, and augmented coding how I actually see it done in the wild.

There's a lot of posts about how to do it well, and I like the idea of it, generally. I think GenAI has genuine applications in software development beyond as a Google/SO replacement.

But then there's real world code. I constantly see:

1. Over-engineering. People used to keep it simple because they were limited by how fast they could type. Well, those gloves sure did come off for a lot of developers.

2. Lack of understanding / memory. If I ask someone how their code works, and they didn't write it (or at least carefully analyse it), it's rare for them to understand or even remember what they did there. The common answer to "how does this work?" went from "I think like this, but let me double check" to "no idea". Some will be proud to tell you they auto-generated documentation, too. If you have any questions about that, chances are you'll get another "no idea" response. If you ask an LLM how it works, that's very hit and miss for non-trivial systems. I always tell my devs I hire them to understand systems first and foremost; building systems comes second. I feel increasingly alone with that attitude.

3. Bugs. So many bugs. It seems devs who generate code need to do a lot more explicit testing than those who don't. There's probably just a missing feedback loop: when typing in code, you tend to test every little button action and so on at least once, it's just part of the work. Chances are you don't break it again after that, so while regressions do happen, manually written code generally has one-time exhaustive manual testing built into the process naturally. If you generate a whole UI area, you need to do thorough testing of all kinds of conditions. Seems people don't.

So while it could be great, from my perspective, it feels like more of a net negative in practice. It's all fun and games until there's a problem. And there always is.

Maybe I have a bad sample of the industry. We essentially specialise in taking over technically disastrous projects and other kinds of tricky situations. Few people hire us to work on a good system with a strong team behind it.

But still, comparing the questionable code bases I got into two years ago with those I get into now, there is a pretty clear change for the worse.

Maybe I'm pessimistic, but I'm starting to think we'll need another software crisis (and perhaps a wee AI winter) to get our act together with this new technology. I hope I'm wrong.

helge9210 5 days ago||
I checked with HR at my company and was told I'm not allowed to announce the following: anyone submitting code, or asking a question about code, without disclosing that the code in question was generated by an LLM will be cursed.
throwawayoldie 5 days ago||
IMHO, s/may/has/
atemerev 6 days ago||
I am a software engineer who writes 80-90% of my code with AI (sorry, can't ignore the productivity boost), and I mostly agree with this sentiment.

I found out very early that under no circumstances may you have code you don't understand, anywhere. Well, you may, but not in public, and you should commit to understanding it before anyone else sees it. Particularly before the sales guys do.

However, AI can help you with learning too. You can run experiments, test hypotheses and burn your fingers so fast. I like it.

benreesman 5 days ago||
I'm currently standing up a C++ capability in an org that hasn't historically had one, so things like the style guide and examples folder require a lot of care to give new contributors a good start.

I have instructions for agents that differ in some details of convention, e.g. human contributors use AAA allocation style, while agents are instructed to use type-first. As I review agent output, I convert code that "graduates" from agent product to review-ready, which keeps me honest about not submitting unscrutinised code to the review of other humans: they are able to prompt an LLM without my involvement, and I'm able to ship LLM slop without making a demand on their time. It's an honor system, but a useful one if everyone acts in good faith.
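
For anyone who hasn't run into the distinction, a rough sketch of the two conventions (the variable names are invented for illustration, not taken from our actual style guide):

    #include <chrono>

    void example() {
        // AAA ("almost always auto") style, roughly what the human style guide asks for
        // (names below are made up for illustration)
        auto timeout = std::chrono::milliseconds{250};
        auto retries = 3;

        // type-first style, what the agents are instructed to emit instead
        std::chrono::milliseconds fallback{500};
        int attempts = 0;
    }

Same semantics either way; having the agents emit a visibly different declaration style just makes it obvious at review time which code has and hasn't graduated.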

I get use from the agents, but I almost always make changes and reconcile contradictions.

pfdietz 6 days ago||
There was trust?
thedudeabides5 5 days ago|
dont.trust.machines