Posted by bblcla 1/14/2026
I don't have the GPUs or time to experiment though :(
You could also use cooking as an analogy: trying to learn to cook by looking at pictures of cooked food, rather than by going to culinary school and learning the principles of how to actually plan and cook good food.
So, we're trying to train LLMs to code, by giving them "pictures" of code that someone else built, rather than by teaching them the principles involved in creating it, and then having them practice themselves.
LLMs definitely can create abstractions and boundaries. E.g. most will lean towards a pretty clean front-end vs back-end split even without hints. Or work out a data structure that fits the need. Or split things into free-standing modules. Or structure a plan into phases.
So this really just boils down to "good" abstractions, which is subject to model improvement.
I really don’t see a durable moat for us meatbags in this line of reasoning
Coming up with a new design from scratch, designing (or understanding) a high level architecture based on some principled reasoning, rather than cargo cult coding by mimicking common patterns in the training data, is a different matter.
LLMs are getting much better at reasoning/planning (or at least something that looks like it), especially for programming & math, but this still comes from training (pre-training and, increasingly, RL), and what they learn obviously depends on what they are trained on. If you wanted LLMs to learn the principles of software architecture, abstraction, etc., you would need to train on human (or synthetic) "reasoning traces" of how humans make those decisions. But it seems that current RL training for programming is mostly based on the artifacts of reasoning (i.e. code), not the reasoning traces that went into designing that code, so that (coding rather than design reasoning) is what they learn.
I would guess that companies like Anthropic are trying to address this paucity of "reasoning traces" for program design, perhaps via synthetic data, since this is not something that occurs much in the wild, especially as you move up the scale of complexity from small problems (student assignments, Stack Overflow advice) to large systems (which are anyway mostly commercial, hence private). You can find a smallish number of large open-source projects like gcc or Linux, but what is missing are the reasoning traces of how the designers went from requirements to designing these systems the way they did (sometimes in questionable fashion!).
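To make that concrete, here is roughly the shape of data I mean (purely illustrative Elixir maps, every field name invented): most of today's coding RL data looks like the first record, while learning design would need something closer to the second.

    artifact_only = %{
      prompt: "Add rate limiting to the public API",
      # just the finished code, none of the rationale behind it
      completion: "defmodule RateLimiter do ... end"
    }

    with_design_trace = %{
      prompt: "Add rate limiting to the public API",
      design_trace: [
        "Requirement: bursts from a few tenants, not global overload",
        "Considered global vs per-tenant token bucket; chose per-tenant",
        "Boundary: keep the limiter in a plug so handlers never see it"
      ],
      # same artifact, but now paired with the reasoning that produced it
      completion: "defmodule RateLimiter do ... end"
    }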
Humans of course learn software architecture in a much different way. As with anything, you can read any number of books, attend any number of lectures, on design principles and software patterns, but developing the skill for yourself requires hands-on personal practice. There is a fundamental difference between memory (of what you read/etc) and acquired skill, both in level of detail and fundamental nature (skills being based on action selection, not just declarative recall).
The way a human senior developer/systems architect acquires the skill of design is by practice, by a career of progressively more complex projects, successes and failures/difficulties, and learning from the process. By learning from your own experience you are of course privy to your own prior "reasoning traces" and will learn which of those lead to good or bad outcomes. Of course learning anything "on the job" requires continual learning, and things like curiosity and autonomy, which LLMs/AI do not yet have.
Yes, we senior meatbags will eventually have to compete with, or be displaced by, machines that are our equal (which is how I would define AGI), but we're not there yet, and I'd predict it's at least 10-20 years out, not least because it seems most of the AI companies are still LLM-pilled and are trying to cash in on the low-hanging fruit.
Software design and development is a strange endeavor since one of the big lessons of LLMs (in general, not just apropos coding) is how much of what we do is (trigger alert) parroting to one degree or another, rather than green-field reasoning and exploration. At the same time, software development, as one gets away from boilerplate solutions towards larger custom systems, is probably one of the more complex and reasoning-intensive things that humans do, and therefore may end up being one of the last things, rather than the first, to completely fall to AI. It may well be AI managers, not humans, who finally declare that AI has reached human parity at software design, able to design systems of arbitrary complexity based on principled reasoning and accumulated experience.
It's largely the same patterns & principles applied in a tailored manner to the problem at hand, which LLMs can...with mixed success...do.
>human parity
Impact is not felt when the hardest part of the problem is cracked, but rather when the easy parts are.
If you have 100 humans making widgets and the AI can do 75% of the work, only about 25 widget jobs remain, so you've suddenly got 4 humans competing for every 1 remaining job. This is going to be Lord of the Flies in the job market long before human parity.
>I'd predict it's at least 10-20 years out
For AGI, probably, but I think by 2030 this will have hit society like a freight train... and hopefully we'll have figured out what we want to do about it by then too. UBI or whatever... because we'll have to.
True, but I didn't mean to focus on creativity, just on the nature of what can be learned when all you have to learn from is artifacts of reasoning (code), not the underlying reasoning traces themselves (the reasoning process for why the code was designed that way). Without reasoning traces you get what we have today, where AI programming in the large comes down to cargo-cult copying of code patterns, without understanding whether the (unknown) design process that led to those patterns reasonably applies to the requirements/situation at hand.
So, it's not about novelty, but rather about having the reasoning traces (for large structured projects) available to learn when to apply design patterns that are already present in the training data - to select design patterns based on a semblance of principled reasoning (RL for reasoning traces), rather than based on cargo cult code smell.
> This is going to be lord of the flies in job market long before human parity.
Perhaps, and that may already be starting, but I think that until we get much closer to AGI you'll still need a human in the loop (both to interact with the AI, and to interact with the team/boss), with AI as a tool not a human replacement. So, the number of jobs may not decrease much, if at all. It's also possible that Jevons paradox applies and that the number of developer jobs actually increases.
It's also possible that human-replacement AGI is harder to achieve than widely thought. For example, maybe things like emotional intelligence and theory of mind are difficult to get right, and without them AI never quite cuts it as a fully autonomous entity that people want to deal with.
> UBI or whatever...because we'll have to.
Soylent Green?
For example, I could see some sort of self-play style RL working. Which architecture? Try them all in a sandbox and see (crude sketch below). Humans need trial-and-error learning, as you say, so why not here too? It seems to have worked for AlphaGo, which arguably also involves components of abstract high-level strategy.
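Very roughly something like this loop (Elixir, with every name and the scoring convention invented for the sake of illustration): generate candidate architectures, actually build and test each one in a sandboxed task, and keep the score.

    defmodule ArchitectureSearch do
      # Each candidate is {name, build_and_test}, where build_and_test is a
      # zero-arity fun that constructs the design and returns {passed, total}.
      def best_candidate(candidates, sandbox_timeout \\ 60_000) do
        candidates
        |> Enum.map(fn {name, build_and_test} ->
          # run each candidate in its own task (sequentially, to keep it simple)
          task = Task.async(fn -> score(build_and_test) end)
          {name, Task.await(task, sandbox_timeout)}
        end)
        |> Enum.max_by(fn {_name, score} -> score end)
      end

      # "Try it and see": a candidate that crashes simply scores 0.
      defp score(build_and_test) do
        try do
          {passed, total} = build_and_test.()
          passed / max(total, 1)
        rescue
          _ -> 0.0
        end
      end
    end

In a real training setup you'd presumably feed the scores back as RL rewards rather than just keeping the winner, but the "learn from your own trial and error" shape is the same.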
>Jevons paradox
I can see it for tokens and possibly software too, but I'm rather skeptical of it in the job-market context. It doesn't seem to have happened for the knowledge work AI has already killed (e.g. translation, or say copywriting). More (slop) stuff is being produced, but it didn't translate into a hiring frenzy of copywriters. It's possible that SWE is somehow different via network effects or something, but I've not heard a strong argument for it yet.
>It's also possible that human-replacement AGI is harder to achieve than widely thought.
Yeah, I think the current paradigm isn't gonna get us there at all. Even if you 10x GPT-5, it still seems to miss some sort of spark that a 5-year-old has but GPT doesn't. It can do PhD-level work, but qualitatively there is something missing about that "intelligence".
Interesting times ahead for better or worse
https://docs.google.com/document/u/0/d/1zo_VkQGQSuBHCP45DfO7...
100% of the code was written by Claude.
It is damn good at making "blocks".
However, Elixir seems to be a language that works very well for LLMs, cf. https://elixirforum.com/t/llm-coding-benchmark-by-language/7...
> Section 3.3:
> Besides, since we use the moderately capable DeepSeek-Coder-V2-Lite to filter simple problems, the Pass@1 scores of top models on popular languages are relatively low. However, these models perform significantly better on low-resource languages. This indicates that the performance gap between models of different sizes is more pronounced on low-resource languages, likely because DeepSeek-Coder-V2-Lite struggles to filter out simple problems in these scenarios due to its limited capability in handling low-resource languages.
It's also now a little bit old (as every AI paper is the second it's published), so I'd be curious to see a newer version.
But I would agree in general that Elixir makes a lot of sense for agent-driven development. Hot code reloading and "let it crash" are useful traits in that regard, I think.
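For what it's worth, the "let it crash" part is easy to picture for agent-driven work (hypothetical module names, and this only shows the supervision side, not hot code reloading): a bad agent-generated patch takes down one worker process, which the supervisor restarts in a clean state.

    defmodule AgentWorker do
      use GenServer

      def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

      @impl true
      def init(:ok), do: {:ok, %{}}

      # Run an arbitrary piece of agent-generated work; if it raises,
      # only this process dies and the supervisor restarts it.
      @impl true
      def handle_cast({:run, fun}, state) do
        fun.()
        {:noreply, state}
      end
    end

    defmodule AgentSupervisor do
      use Supervisor

      def start_link(_opts), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

      @impl true
      def init(:ok) do
        # one_for_one: a crashing worker is restarted without touching anything else
        Supervisor.init([AgentWorker], strategy: :one_for_one)
      end
    end

    # In iex:
    #   AgentSupervisor.start_link([])
    #   GenServer.cast(AgentWorker, {:run, fn -> raise "bad agent output" end})
    #   # the worker crashes and is restarted; the rest of the system keeps running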
The developers who aren't figuring out how to leverage AI tools and make them work for them are going to get left behind very quickly. Unless you're in the top tier of engineers, I'm not sure how one can blame the tools at this point.