Posted by CoffeeOnWrite 6 days ago
The blog itself is using Alpine JS, which is a human-written framework from 6 years ago (https://github.com/alpinejs/alpine), and you can see the result is not good.
Two completely unnecessary requests: to jsdelivr.net and net.cdn.cloudflare.net.
Never actually expected it to be posted on HN. Working on getting a static version up now.
3 commenters have obviously only read the title, and 3 comments are about how the article requires JS.
Well played HN.
Otherwise please use the original title, unless it is misleading or linkbait.
This title counts as linkbait so I've changed it. It turns out the article is much better (for HN) than the title suggests.
Good change btw.
What are you using to measure 'better' here? And what professions came to mind when you wrote 'almost all professions'?
If I think about some of the professionals/workers I've interacted with in the last month, yes, they _could_ use an LLM for a very small subset of what they actually do, but the errors it can introduce if it's relied upon (the 'average' person is implied not to do their job as well, so relying on the output is likely, now or into the future?) would, I'd wager, make things worse in their current state.
It might get better, especially over a time horizon as long as 20 years, but I'm not expecting those systems to be recognizable as what we currently have with the SOTA LLMs (which is mostly what people are referring to when they use the marketing term 'AI'). And in the long term, focusing on/relying on 'AI' to improve the ability of professionals in 'almost all professions' is IMO just the wrong thing to be sinking such a large amount of money into.
If you need to jump into large, unknown codebases, LLMs are pretty damn good at explaining how they work and where you can find stuff.
A lot faster than clicking through functions in an IDE or doing a desperate "find in project" for something.
And just sticking a stack trace into an LLM assistant will, in my opinion, in about 90% of the cases I've encountered, either give you the correct fix immediately or at the very least point you to the right place to fix things.
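For what it's worth, here's a minimal sketch of that stack-trace workflow in Python, assuming the OpenAI client library; the model name, prompt wording, and function name are just placeholders, not a recommendation:

    # Rough sketch: feed a stack trace to an LLM and ask where the fix likely goes.
    # Assumes the `openai` package is installed and OPENAI_API_KEY is set.
    import sys
    from openai import OpenAI

    def explain_stack_trace(trace: str, model: str = "gpt-4o-mini") -> str:
        client = OpenAI()
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": (
                    "You are a debugging assistant. Given a stack trace, point to "
                    "the most likely faulty code and suggest a concrete fix.")},
                {"role": "user", "content": trace},
            ],
        )
        return response.choices[0].message.content

    if __name__ == "__main__":
        # Usage: python explain_trace.py < trace.txt
        print(explain_stack_trace(sys.stdin.read()))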
I understand the frustration: meaning reduced to metadata, debate replaced with reaction, and the richness of human thought lost in the echo of paraphrased content. If there is an exit to this timeline, I too would like to request the coordinates.
[ai]: rewrote the documentation ...
This helps us put on another set of "glasses" as we later review the code.
If you use AI as tab-complete but it's what you would've done anyway, should you flag it? I don't know; there's plenty to think about when it comes to what the right amount of disclosure is.
I certainly wish that at our company, people could flag (particularly) large commits as coming from a tool rather than a person, but I guess the idea is that the person is still responsible for whatever the tool generates.
The problem is that it's incredibly enticing for over-worked engineers to have AI do large (in diff terms) but boring tasks that they'd typically get very little recognition for (e.g. ESLint migrations).
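If a team did want to nudge people towards that kind of disclosure, one low-effort option is a commit-msg hook. Just a sketch, assuming the "[ai]:" commit prefix quoted above; the 300-changed-lines threshold and the warn-only behaviour are assumptions on my part:

    #!/usr/bin/env python3
    # Sketch of .git/hooks/commit-msg: warn (don't block) when a large commit
    # lacks the "[ai]:" prefix. The 300-changed-lines threshold is arbitrary.
    import subprocess
    import sys

    message = open(sys.argv[1], encoding="utf-8").read()

    # Rough size of the staged change, parsed from e.g.
    # "3 files changed, 250 insertions(+), 40 deletions(-)"
    shortstat = subprocess.run(
        ["git", "diff", "--cached", "--shortstat"],
        capture_output=True, text=True, check=True,
    ).stdout
    changed = sum(int(tok) for tok in shortstat.split() if tok.isdigit())

    if changed > 300 and not message.startswith("[ai]:"):
        sys.stderr.write(
            "note: large commit without an [ai]: flag. If a tool generated most "
            "of this diff, consider saying so in the commit message.\n"
        )

    sys.exit(0)  # advisory only; never reject the commit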
The HN title has been editorialised since it was submitted; it originally said "Yes, I will judge you for using AI...", and a lot of the early replies were dismissive based on the title alone.
It might not solve every problem, but it solves enough of them well enough that it belongs in the toolkit.
AI may reach that point - being enough better than our own thinking that we don't think much anymore, and we get worse at thinking as a result. Well, is that a net win, or not? If we get there for that reason, it's probably a net win[1]. If we get there because the AI companies are really good at PR, that's a definite net loss.
All that is for the future, though. I think that currently, it's a net loss. Keep your ability to think; don't trust AI any farther than you yourself understand.
[1] It could still not be a net win, if AI turns out to be very useful but also either damaging or malicious, and lack of thinking for ourselves causes us to miss that.
Yeah... pretty sure you've never programmed in assembly if you think that.
That comparison kind of makes my point, though. Sure, you can bury your face in TikTok for 12 hours a day, and they do kind of suck at Excel, but smartphones are massively useful tools used by (approximately) everyone.
Someone not using a smartphone in this day and age can quite fairly be called a 'luddite'.
A computer is a bicycle for the mind; an LLM is an easy-chair.
It's often brought up that if you don't use LLMs now to produce so-so code, you will somehow magically fall off completely when LLMs all of a sudden start producing perfect code, as if developers haven't been constantly learning new tools as the field has evolved. Yes, I use old technology, but I also try new technology and pick and choose what works for me and what does not. Just because LLMs don't have a good place in my workflow does not mean I am not using them at all or that I haven't tried to use them.
Most of the current discourse on AI coding assistants sounds either breathlessly optimistic or catastrophically alarmist. What’s missing is a more surgical observation: the disruptive effect of LLMs is not evenly distributed. In fact, the clash between how open source and industry teams establish trust reveals a fault line that’s been papered over with hype and metrics.
FOSS projects work on a trust basis - but the industry standard is automated testing, pair programming, and development speed. That CRUD app for finding out if a rental car is available? Not exactly in need of a hand-crafted piece of code, and no-one cares if Junior Dev #18493 is trusted within the software dev organization.
If the LLM-generated code breaks, blame gets passed, retros are held, Jira tickets multiply — the world keeps spinning, and a team fixes it. If a junior doesn’t understand their own patch, the senior rewrites it under deadline. It’s not pretty, but it works. And when it doesn’t, nobody loses “reputation” - they lose time, money, maybe sleep. But not identity.
LLMs challenge open source where it’s most vulnerable - in its culture. Meanwhile, industry just treats them like the next Jenkins: mildly annoying at first, but soon part of the stack.
The author loves the old ways, for many valid reasons: Gabled houses are beautiful, but outside of architectural circles, prefab is what scaled the suburbs, not timber joints and romanticism.
Making this sort of blanket assessment of AI, as if it were a singular, static phenomenon, is bad thinking. You can say things like "AI code bad!" about a particular model, or a particular model used in a particular context, and make sense. You cannot make generalized statements about LLMs as if they are uniform in their flaws and failure modes.
They're as bad now as they're ever going to be again, and they're getting better faster, at a rate outpacing the expectations and predictions of all the experts.
The best experts in the world, working on these systems, have a nearly universal sentiment of "holy shit" when working on and building better AI - we should probably pay attention to what they're seeing and saying.
There's a huge swathe of performance gains to be made in fixing awful human code. There's a ton of low-hanging fruit to be picked by doing repetitive and tedious stuff humans won't or can't do. Those two things mean at least 20 years of impressive utility from AI code can be had.
Things are just going to get faster, and weirder, and weirder faster.
No, and there’s no reason to think cars will stop improving either, but that doesn’t mean they will start flying.
The first error is in thinking that AI is converging towards a human brain. Treating this as a null hypothesis is incongruent both with the functional differences between the two and, crucially, with empirical observations of the current trajectory of LLMs. We have seen rapid increases in ability, yes, but those abilities are very asymmetrical by domain. Pattern matching and shitposting? Absolutely crushing humans already. Novel conceptual ideas and consistency-checked reasoning? Not so much; e.g. all that hype around PhD-level novel math problems died down as quickly as it had been manufactured. If they were converging on human brain function, why would the ability increases be so vastly uneven?
The second error is to assume superlinear ability improvement when the data has more or less run out and has to be slowly replenished over time, while avoiding the AI pollution in public sources. It's like assuming oil output would keep accelerating after it had run out and we needed to wait for more bio-matter to decompose for every new drop of crude. Can we improve engine design and make ICEs more efficient? Yes, but it's a diminishing-returns game. The scaling hypothesis was not exponential but sigmoid, which is in line with most paradigm shifts and novel discoveries.
> Making this sort of blanket assessment of AI, as if it were a singular, static phenomenon, is bad thinking.
I agree, but do you agree with yourself here? I.e.:
> no reason to think that these tools won't vastly outperform us in the very near future
... so back to a single axis again? How is this different from saying calculators outperform humans?
Where can I go to see language model shitposters that are better than human shitposters?
(Always remember to use eye protection.)
But I think everyone is losing trust not because LLMs have no potential to write good code; it's the loss of trust in the users who use LLMs to uncontrollably generate those patches without any knowledge, fact-checking, or verification (many of them may not even know how to test it).
In other words, while an LLM is potentially capable of being a good SWE, the humans behind it right now are spamming, doing nonsense work, and leaving the unpaid open source maintainers to review and give feedback on it (most of the time, manually).