Posted by CoffeeOnWrite 6 days ago
The blog itself is using Alpine JS, which is a human-written framework from 6 years ago (https://github.com/alpinejs/alpine), and you can see the result is not good.
Two completely unnecessary requests: to jsdelivr.net and net.cdn.cloudflare.net.
Never actually expected it to be posted on HN. Working on getting a static version up now.
3 commenters have obviously only read the title, and 3 comments are about how the article requires JS.
Well played HN.
Otherwise please use the original title, unless it is misleading or linkbait.
This title counts as linkbait so I've changed it. It turns out the article is much better (for HN) than the title suggests.
Good change btw.
What are you using to measure 'better' here? And what professions came to mind when you wrote 'almost all professions'?
If I think about some of the professionals/workers I've interacted with in the last month, yes, they _could_ use an LLM for a very small subset of what they actually do, but the errors it can introduce if it's relied upon (the 'average' person is implied not to do their job as well, so relying on the output is likely, now or into the future?) would, I'd wager, make things worse in their current state.
It might get better, especially over a time horizon as long as 20 years, but I'm not expecting those systems to be recognizable as what we currently have with the SOTA LLMs (which is mostly what people are referring to when they use the marketing term 'AI'). And in the long term, focusing on/relying on 'AI' to improve the ability of professionals in 'almost all professions' is IMO just the wrong thing to be sinking such a large amount of money into.
If you need to jump into large, unknown codebases, LLMs are pretty damn good at explaining how they work and where you can find stuff.
A lot faster than clicking through functions in an IDE or doing a desperate "find in project" for something.
And just sticking a stack trace into an LLM assistant will, in my opinion, in about 90% of the cases I've encountered, either give you the correct fix immediately or at the very least point you to the right place to fix things.
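For what it's worth, here's a minimal sketch of that stack-trace workflow in Python, assuming the OpenAI client library; the model name, prompt wording, and function name are just placeholders, not a recommendation:

    # Rough sketch: feed a stack trace to an LLM and ask where the fix likely goes.
    # Assumes the `openai` package is installed and OPENAI_API_KEY is set.
    import sys
    from openai import OpenAI

    def explain_stack_trace(trace: str, model: str = "gpt-4o-mini") -> str:
        client = OpenAI()
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": (
                    "You are a debugging assistant. Given a stack trace, point to "
                    "the most likely faulty code and suggest a concrete fix.")},
                {"role": "user", "content": trace},
            ],
        )
        return response.choices[0].message.content

    if __name__ == "__main__":
        # Usage: python explain_trace.py < trace.txt
        print(explain_stack_trace(sys.stdin.read()))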
I understand the frustration: meaning reduced to metadata, debate replaced with reaction, and the richness of human thought lost in the echo of paraphrased content. If there is an exit to this timeline, I too would like to request the coordinates.
[ai]: rewrote the documentation ...
This helps us put on another set of "glasses" as we later review the code.
If you use AI as tab-complete but it's what you would've done anyway, should you flag it? I don't know; there's plenty to think about when it comes to what the right amount of disclosure is.
I certainly wish that at our company, people could flag (particularly) large commits as coming from a tool rather than a person, but I guess the idea is that the person is still responsible for whatever the tool generates.
The problem is that it's incredibly enticing for over-worked engineers to have AI do large (in diff terms) but boring tasks that they'd typically get very little recognition for (e.g. ESLint migrations).
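If a team did want to nudge people towards that kind of disclosure, one low-effort option is a commit-msg hook. Just a sketch, assuming the "[ai]:" commit prefix quoted above; the 300-changed-lines threshold and the warn-only behaviour are assumptions on my part:

    #!/usr/bin/env python3
    # Sketch of .git/hooks/commit-msg: warn (don't block) when a large commit
    # lacks the "[ai]:" prefix. The 300-changed-lines threshold is arbitrary.
    import subprocess
    import sys

    message = open(sys.argv[1], encoding="utf-8").read()

    # Rough size of the staged change, parsed from e.g.
    # "3 files changed, 250 insertions(+), 40 deletions(-)"
    shortstat = subprocess.run(
        ["git", "diff", "--cached", "--shortstat"],
        capture_output=True, text=True, check=True,
    ).stdout
    changed = sum(int(tok) for tok in shortstat.split() if tok.isdigit())

    if changed > 300 and not message.startswith("[ai]:"):
        sys.stderr.write(
            "note: large commit without an [ai]: flag. If a tool generated most "
            "of this diff, consider saying so in the commit message.\n"
        )

    sys.exit(0)  # advisory only; never reject the commit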
The HN title has been editorialised since it was submitted; it originally said "Yes, I will judge you for using AI...", and a lot of the early replies were dismissive based on the title alone.
It might not solve every problem, but it solves enough of them well enough that it belongs in the toolkit.
AI may reach that point - being enough better than our own thinking that we don't think much anymore, and we get worse at thinking as a result. Well, is that a net win, or not? If we get there for that reason, it's probably a net win[1]. If we get there because the AI companies are really good at PR, that's a definite net loss.
All that is for the future, though. I think that currently, it's a net loss. Keep your ability to think; don't trust AI any farther than you yourself understand.
[1] It could still not be a net win, if AI turns out to be very useful but also either damaging or malicious, and lack of thinking for ourselves causes us to miss that.
Yeah... pretty sure you've never programmed in assembly if you think that.
That comparison kind of makes my point, though. Sure, you can bury your face in TikTok for 12 hours a day, and they do kind of suck at Excel, but smartphones are massively useful tools used by (approximately) everyone.
Someone not using a smartphone in this day and age can quite fairly be called a 'luddite'.
A computer is a bicycle for the mind; an LLM is an easy-chair.
It's often brought up that if you don't use LLMs now to produce so-so code, you will somehow magically fall off completely when LLMs all of a sudden start producing perfect code, as if developers haven't been constantly learning new tools as the field has evolved. Yes, I use old technology, but I also try new technology and pick and choose what works for me and what does not. Just because LLMs don't have a good place in my workflow does not mean I am not using them at all or that I haven't tried to use them.
Most of the current discourse on AI coding assistants sounds either breathlessly optimistic or catastrophically alarmist. What’s missing is a more surgical observation: the disruptive effect of LLMs is not evenly distributed. In fact, the clash between how open source and industry teams establish trust reveals a fault line that’s been papered over with hype and metrics.
FOSS projects work on a trust basis - but the industry standard is automated testing, pair programming, and development speed. That CRUD app for finding out if a rental car is available? Not exactly in need of a hand-crafted piece of code, and no-one cares if Junior Dev #18493 is trusted within the software dev organization.
If the LLM-generated code breaks, blame gets passed, retros are held, Jira tickets multiply — the world keeps spinning, and a team fixes it. If a junior doesn’t understand their own patch, the senior rewrites it under deadline. It’s not pretty, but it works. And when it doesn’t, nobody loses “reputation” - they lose time, money, maybe sleep. But not identity.
LLMs challenge open source where it’s most vulnerable - in its culture. Meanwhile, industry just treats them like the next Jenkins: mildly annoying at first, but soon part of the stack.
The author loves the old ways, for many valid reasons: Gabled houses are beautiful, but outside of architectural circles, prefab is what scaled the suburbs, not timber joints and romanticism.
Making this sort of blanket assessment of AI, as if it were a singular, static phenomenon, is bad thinking. You can say things like "AI code bad!" about a particular model, or a particular model used in a particular context, and make sense. You cannot make generalized statements about LLMs as if they are uniform in their flaws and failure modes.
They're as bad now as they're ever going to be again, and they're getting better faster, at a rate outpacing the expectations and predictions of all the experts.
The best experts in the world, working on these systems, have a nearly universal sentiment of "holy shit" when working on and building better AI - we should probably pay attention to what they're seeing and saying.
There's a huge swathe of performance gains to be made in fixing awful human code. There's a ton of low-hanging fruit to be picked by doing repetitive and tedious stuff humans won't or can't do. Those two things mean at least 20 years of impressive utility from AI code can be had.
Things are just going to get faster, and weirder, and weirder faster.
No, and there’s no reason to think cars will stop improving either, but that doesn’t mean they will start flying.
The first error is in thinking that AI is converging towards a human brain. Treating this as a null hypothesis is incongruent both with the functional differences between the two and, crucially, with empirical observations of the current trajectory of LLMs. We have seen rapid increases in ability, yes, but those abilities are very asymmetrical by domain. Pattern matching and shitposting? Absolutely crushing humans already. Novel conceptual ideas and consistency-checked reasoning? Not so much; e.g. all that hype around PhD-level novel math problems died down as quickly as it had been manufactured. If they were converging on human brain function, why would the ability increases be so vastly uneven?
The second error is to assume superlinear ability improvement when the data has more or less run out and has to be slowly replenished over time, while avoiding the AI pollution in public sources. It's like assuming oil output would keep accelerating after it had run out and we needed to wait for more bio-matter to decompose for every new drop of crude. Can we improve engine design and make ICEs more efficient? Yes, but it's a diminishing-returns game. The scaling hypothesis was not exponential but sigmoid, which is in line with most paradigm shifts and novel discoveries.
> Making this sort of blanket assessment of AI, as if it were a singular, static phenomenon, is bad thinking.
I agree, but do you agree with yourself here? I.e.:
> no reason to think that these tools won't vastly outperform us in the very near future
... so back to a single axis again? How is this different from saying calculators outperform humans?
Where can I go to see language model shitposters that are better than human shitposters?
(Always remember to use eye protection.)
But I think everyone is losing trust not because LLMs have no potential to write good code; it's the loss of trust in the users who use LLMs to uncontrollably generate those patches without any knowledge, fact-checking, or verification (many of them may not even know how to test it).
In other words, while an LLM is potentially capable of being a good SWE, the humans behind it right now are spamming, doing nonsense work, and leaving the unpaid open source maintainers to review and give feedback on it (most of the time, manually).