Posted by simonpure 4 days ago
Wouldn't this work be extremely easy to implement with an LLM coder?
Claude Code doesn't need to be interested to work.
In another comment here I explained that I have run a test: asking Claude Code to add a substantial feature to 270 different C programs.
Despite your beliefs - it went extremely well.
But xscreensaver theme tweaks for personal use have a much lower standard for quality control, regression testing, side effects, etc than a kernel used by billions of devices with thousands of interconnected drivers and subsystems.
Not to mention the coordination problem to get every maintainer on board and patches approved for each specific area when working on a project of that scale, even for a relatively narrow change.
Claude Code doesn't really help with that, so I don't see why the expectation would be a significant speedup (and doing it all in a single patch would definitely be rejected).
I refuse to believe the six year delay here was getting people to test a patch.
Which, actually, Claude Code will also do quite well.
Not that the Linux kernel approval procedures couldn't be streamlined, work couldn't be parallelized, or anything else like that, which would be a different discussion entirely.
You stated that Claude Code could have significantly sped up the process, so the burden of evidence here should be on showing how specifically these patches would have benefited from LLMs, or how much time would have been saved. Hand-wavingly saying "LLMs = faster" is too vague and broad a claim to make without any evidence (and it is also unfalsifiable).
And what I'm saying is I refuse to believe the Linux kernel approval procedures are that inefficient. Therefore, your belief "bottleneck was most likely not mechanical code changes" is most likely incorrect.
It would be interesting to get the actual answer to this question.
EDIT: Substantially changing your argument after posting isn't nice. But to answer your charge - no - I never made that claim.
That's a different scenario, though.
Would Claude have performed adequately if it had to add a specific feature to 270 programs buried in a set of 270m programs, each of which may or may not have a dependency on one or more of the others, with virtually unbounded results to test?
In terms of tokens alone, that would have been cost-prohibitive. But let's assume that you had the money to do this: it still might not even be possible.
You're confusing "I have these 270 independent programs and want to make this change to all of them" with "I have these 270m lines of code, of which only 270 need to be changed".
Let's see if they'll let this account through.
You can find the "strncpy"s with grep, but you cannot find all the downstream effects of those changes, especially if something downstream is relying on the broken behaviour!
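To make the "relying on the broken behaviour" point concrete, here is a minimal sketch in Python that models two copy routines on a fixed-size buffer. The function names `model_strncpy` and `model_terminating_copy` are hypothetical and only imitate the documented semantics of C's strncpy (zero-pads, may leave no terminator) and of a generic "always terminate, never pad" replacement. It is not taken from any real patch.

```python
def model_strncpy(dest: bytearray, src: bytes, n: int) -> None:
    """Model of C strncpy: copy at most n bytes, zero-pad the rest,
    and leave NO terminator when src has n or more bytes.
    Assumes n <= len(dest)."""
    data = src[:n]
    dest[:n] = data + b"\x00" * (n - len(data))


def model_terminating_copy(dest: bytearray, src: bytes, n: int) -> None:
    """Hypothetical replacement: always NUL-terminates, truncates if
    needed, and does not touch the tail of the buffer.
    Assumes n <= len(dest)."""
    if n == 0:
        return
    data = src[: n - 1]
    dest[: len(data)] = data
    dest[len(data)] = 0


STALE = b"\xaa" * 8  # pretend the buffer held old data before the copy

# Short source: the old routine wipes the tail, the new one leaves it.
padded = bytearray(STALE)
model_strncpy(padded, b"abc", 8)
unpadded = bytearray(STALE)
model_terminating_copy(unpadded, b"abc", 8)

# Long source: the old routine fills the buffer with no terminator,
# the new one drops one more byte to make room for it.
trunc_old = bytearray(STALE)
model_strncpy(trunc_old, b"abcdefghij", 8)
trunc_new = bytearray(STALE)
model_terminating_copy(trunc_new, b"abcdefghij", 8)

# Downstream code that compares or copies the whole 8-byte field
# sees different bytes after the "mechanical" replacement.
whole_field_changed = bytes(padded) != bytes(unpadded)
```

Any caller that compares the full fixed-size field, or copies it somewhere else wholesale, behaves differently after the swap even though both versions "copy the string". grep finds the call site, not that caller.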
I took the 10 most difficult patches from the git history - the ones that took the most back-and-forth to fix. I asked Claude to write them. Would you like to see the work?
If you believe a human performs better at finding downstream effects - you need to prove that. I see no reason why it should be true.
Once again, you are not reading what is being said - no one made that claim!
No claim was made in fact: it was a refutation. Specifically, the refutation is "this is why it took so many years".
You did not literally make that claim but your cost argument hinges on it.
Without it, Claude does about the same as a human and only costs $100.
Apparently I'm reading your comments more thoroughly than you are.
What happens if you turn a job like that over to Claude Code? A mess? Good results? Code bloat? Worth trying on existing C programs.
It mostly did an amazing job in a short period of time.
EDIT: Of course I get downvoted for saying this. HN isn't interested in reality any more.
I suspect that rather many of us are simply just tired of Claude and friends getting shoehorned into any conversation about programming at this point. It is about as fun as the Rust Brigade entering any discussion about C. It adds nothing new to the discussion and it is frankly tiring since we pretty much at any time have a handful of conversations on the front page already covering "AI" topics anyway (counting four at the time of writing this).
I thought automation would be interesting to HN - given the context and the fact it was not used.
Pretty sure that's exactly what LLMs in coding harnesses are.
Even when they implement LLMs.
I know ChatGPT seems like it's non-deterministic - but that's just a user preference.
The only real issue is if you have many people on the same server, the GPU contention can be non-deterministic in rare cases.
LLMs can be deterministic if you need them to be - it's just that most people prefer the human-like interface.
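As a rough illustration of the determinism claim, here is a toy sketch of the token-selection step only. The function names, the logits, and the seed are made up for the example. It assumes the model produces identical logits for identical input, which is exactly the part that shared-GPU effects like those mentioned above could break.

```python
import math
import random


def softmax(logits, temperature):
    """Turn raw scores into probabilities at the given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, temperature, rng):
    """Temperature 0 means greedy (argmax); otherwise sample."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def sample_run(logits, temperature, seed, steps):
    """Pick `steps` tokens from the same logits with a seeded RNG."""
    rng = random.Random(seed)
    return [pick_token(logits, temperature, rng) for _ in range(steps)]


toy_logits = [1.0, 3.5, 0.2, 2.9]  # hypothetical scores for 4 tokens

greedy = sample_run(toy_logits, 0, seed=1, steps=5)
seeded_a = sample_run(toy_logits, 1.0, seed=7, steps=20)
seeded_b = sample_run(toy_logits, 1.0, seed=7, steps=20)
```

Greedy selection always returns the highest-scoring token, and sampling with a fixed seed repeats the same sequence. Whether a hosted service exposes either knob, or keeps its logits bit-identical across runs, is a separate question this sketch does not answer.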
unfortunately as time goes by, the linux api surface gets larger and more convoluted. so there's going to be some coverage you're just never going to get.
but in the abstract, definitely. linux is so bloated at this point that it's not clear that it can ever be 'made safe'.