Posted by keybits 1 day ago
I really liked the paragraph about LLMs being "alien intelligence"
> Many engineers I know fall into 2 camps, either the camp that find the new class of LLMs intelligent, groundbreaking and shockingly good. In the other camp are engineers that think of all LLM generated content as “the emperor’s new clothes”, the code they generate is “naked”, fundamentally flawed and poison.
> I like to think of the new systems as neither. I like to think about the new class of intelligence as “Alien Intelligence”. It is both shockingly good and shockingly terrible at the exact same time.
> Framing LLMs as “Super competent interns” or some other type of human analogy is incorrect. These systems are aliens and the sooner we accept this the sooner we will be able to navigate the complexity that injecting alien intelligence into our engineering process leads to.
It's an analogy I find compelling. The way they produce code and the way you have to interact with them really does feel "alien", and when you start humanizing them, you start getting emotional when interacting with them, and that's a mistake.
I mean, I do get emotional and frustrated even when good old deterministic programs misbehave and there's some bug to find and squash or work around, but LLM interactions can take that to a whole new level. So we need to remember they are "alien".
If we agree that we are all human and assume that all other humans are conscious as we are, I think we can extrapolate that there is a generic concept of "human intelligence", even if it's pretty hard to nail down, and even if there are several definitions of human intelligence out there.
For the other part of the comment, I'm not too familiar with Discourse's open source approach, but I'd guess those rules are there mainly for employees, and since they develop in the open, they make them public as well.
These new submarines are a lot closer to human swimming than the old ones were, but they’re still very different.
I've found myself wanting line-level blame for LLMs. If my teammate committed something that was written directly by Claude Code, it's more useful to me to know that than to have the blame assigned to the human through the squash+merge PR process.
Ultimately somebody needs to be on the hook. But if my teammate doesn't understand it any better than I do, I'd rather that be explicit and avoid the dance of "you committed it, therefore you own it," which is better in principle than in practice IMO.
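A rough sketch of how line-level AI blame could work today, assuming the team adopts the convention of keeping an AI co-author trailer in commit messages (Claude Code, for instance, can add a "Co-authored-by: Claude" trailer to commits it makes). The marker list and the script below are illustrative assumptions, not a standard:

    #!/usr/bin/env python3
    """Sketch: approximate line-level "AI blame" for one file.
    Assumes AI-assisted commits carry a Co-authored-by trailer naming the
    tool; the marker names below are a made-up convention."""
    import subprocess
    import sys

    AI_MARKERS = ("claude", "copilot", "cursor")  # hypothetical naming convention

    def blame_commits(path: str) -> list[str]:
        """Return the commit sha responsible for each line of `path`, in order."""
        out = subprocess.run(
            ["git", "blame", "--line-porcelain", path],
            capture_output=True, text=True, check=True,
        ).stdout
        # In --line-porcelain output, each line's header begins with a 40-char sha.
        return [line.split()[0] for line in out.splitlines()
                if line and not line.startswith("\t") and len(line.split()[0]) == 40]

    def is_ai_commit(sha: str) -> bool:
        """True if the commit's Co-authored-by trailer names an AI tool."""
        if set(sha) == {"0"}:  # uncommitted lines blame to an all-zero sha
            return False
        trailers = subprocess.run(
            ["git", "log", "-1", "--format=%(trailers:key=Co-authored-by)", sha],
            capture_output=True, text=True, check=True,
        ).stdout.lower()
        return any(marker in trailers for marker in AI_MARKERS)

    if __name__ == "__main__":
        path = sys.argv[1]
        for lineno, sha in enumerate(blame_commits(path), start=1):
            if is_ai_commit(sha):
                print(f"{path}:{lineno}  AI-attributed commit {sha[:10]}")

The obvious gap is squash merges: if the trailer doesn't survive into the squash commit's message, the attribution is gone, which is roughly the situation you're describing.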
Will the contributor respond to code-review feedback? Will they follow up on the work? Will they work within the code of conduct and learn the contributor guidelines? All great things to figure out on small bugs, rather than after the contributor has done significant feature work.
There are plenty of open source projects where it is difficult to get up to speed with the intricacies of the architecture, which limits the ability of talented coders to contribute on a small scale.
There might be merit in having a channel for AI contributions that casual helpers can assess to see if they pass a minimum quality threshold, before passing them on to a project maintainer to assess how the change fits within the overall architecture.
It would also be fascinating to see how good an AI would be at assessing the quality of a set of AI-generated changes absent the instructions that generated them. It may not be able to clearly identify whether a change would work, but can it at least rank a collection of submissions and select the ones most worth looking at?
At the very least, the pile of PRs counts as data about things people wanted to do. Even if the code is completely unusable, putting it in a pile somewhere might make it minable for the intentions of would-be contributors.
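On the ranking question, here's a minimal sketch of the idea, using the OpenAI Python SDK as a stand-in backend; the model name, the 1-10 rubric, and the prompt are assumptions for illustration, not something any project actually runs:

    """Sketch: score each submitted diff in isolation, then sort so a
    maintainer looks at the most promising submissions first."""
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    PROMPT = (
        "You are triaging pull requests for an open source project. "
        "Rate the following diff from 1 (unusable) to 10 (clearly mergeable), "
        "considering correctness, scope, and style. Reply with only the number.\n\n{diff}"
    )

    def score_diff(diff: str) -> int:
        """Ask the model for a 1-10 quality score for a single diff."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption: any capable model would do
            messages=[{"role": "user", "content": PROMPT.format(diff=diff)}],
        )
        try:
            return int(resp.choices[0].message.content.strip())
        except ValueError:
            return 0  # unparseable answer: rank it last

    def rank_submissions(diffs: dict[str, str]) -> list[tuple[str, int]]:
        """Rank {pr_id: diff} pairs, best-scoring first."""
        scored = [(pr_id, score_diff(diff)) for pr_id, diff in diffs.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)

Even a noisy ranking like this would at least tell a maintainer where to spend their first hour.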
First of all, if you want innovation, why are you forcing it into a single week? You very likely have smart people with very good ideas, but they’re held back by your number-driven bullshit. These orgs actively kill innovation by reducing talent to quantifiable rows of data.
A product cobbled together from shit prototype code very obviously stands out. It has various pages that don’t quite look or work the same; cross-functional things that “work everywhere else” don’t in some parts.
It rewards only the people who make good presentations, or who pick the “current hype thing” to work on. Occasionally something good that addresses real problems at least gets mentioned, but the hype thing will always win (if judged by your SLT).
Shame on you if the slop prototype is handed off to a team other than the hackathon presenters. The presenters take all the promotion points, then the implementers have to sort out a bunch of bullshit code, very likely being told to just ship the prototype: “it works, you idiots, I saw it in the demo, just ship it.” Which is so incredibly short-sighted.
I think the depressing truth is that your executives know it’s all cobbled-together bullshit, but that it will sell anyway, so why invest time making it actually good? They all have their golden parachutes; what do they care about the suckers stuck on call for the house of cards they were forced to build, despite possessing the talent to make it stable? All this stupidity happens over and over again, not because it is wise, or even the best way to do this; the truth is just a flaccid “eh, it’ll work though, fuck it, let’s get paid.”
We have to do better than that before congratulating ourselves about all the wonderful "innovation".
> This feels extremely counterproductive and fundamentally unenforceable to me. Much of the code AI generates is indistinguishable from human code anyway. You can usually tell a prototype that is pretending to be a human PR, but a real PR a human makes with AI assistance can be indistinguishable.
Isn't that exactly the point? Doesn't this achieve exactly what the whole article is arguing for?
A hard "No AI" rule filters out all the slop, and all the actually good stuff (which may or may not have been made with AI) makes it in.
When the AI assisted code is indistinguishable from human code, that's mission accomplished, yeah?
Although I can see two counterarguments. First, it might just be Covert Slop. Slop that goes under the radar.
And second, there might be a lot of baby thrown out with that bathwater: stuff that was made in conjunction with AI and contains a lot of "obviously AI" tells, but where a human did indeed put in the work to review it.
I guess the problem is there's no way of knowing that? Is there a Proof of Work for code review? (And a proof of competence, to boot?)
And from the point of view of the maintainers, it seems a terrible idea to set up rules with the expectation that they will be broken.
Or, the decentralized, no rulers solution: clone the repo on your own website and put your patches there instead.
"Forced you to lie"?? Are you serious?
If the project says "no AI", and you insist on using AI, that's not "forcing you to lie"; that's you not respecting their rules and choosing to lie, rather than just go contribute to something else.
In a live setting, you could ask the submitter to explain various parts of the code. Async, that doesn't work, because presumably someone who used AI without disclosing it would just use AI for the explanation as well.
1. Someone raises a PR
2. Entry-level maintainers skim through it and either reject or pass higher up
3. If the PR has sufficient quality, the PR gets reviewed by someone who actually has merge permissions
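If the project lives on GitHub, here's a minimal sketch of wiring that flow up with labels, using PyGithub; the repository name, token variable, and label names are made up for illustration:

    """Sketch: model the two-tier review flow with labels on a GitHub repo."""
    import os
    from github import Github

    REPO = "example-org/example-project"  # hypothetical repository

    gh = Github(os.environ["GITHUB_TOKEN"])
    repo = gh.get_repo(REPO)

    for pr in repo.get_pulls(state="open"):
        issue = repo.get_issue(pr.number)  # labels are managed on the issue side
        labels = {label.name for label in issue.get_labels()}
        if not labels & {"triage:first-pass", "triage:ready-for-merge-review"}:
            # Step 2: queue new PRs for an entry-level maintainer to skim.
            issue.add_to_labels("triage:first-pass")
        elif "triage:approved" in labels:
            # Step 3: promote PRs that passed the skim to someone with merge permissions.
            issue.add_to_labels("triage:ready-for-merge-review")

The labels just make the hand-off between step 2 and step 3 visible; the actual skimming still has to be done by people.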