Posted by swah 4 days ago
I'm more worried about the opposite: the next popular programming paradigm will be something that's hard for humans to read but not so hard for an LLM. For example, English -> assembly.
Right now LLMs are taking languages meant for humans to understand better via abstraction, what if the next language is designed for optimal LLM/world model understanding?
Or instead of an entirely new language, there's some form of compiling/transpiling from the model language to a human-centric one, like WASM for LLMs.
"I prompted it like this"
"I gave it the same prompt, and it came out different"
It's not programming. It might be having a pseudo-conversation with a complex system, but it's not programming.
Well I think the article would say that you can diff the documentation, and it's the documentation that is feeding the AI in this new paradigm (which isn't direct prompting).
If the definition of programming is "a process to create sets of instructions that tell a computer how to perform specific tasks" there is nothing in there that requires it to be deterministic at the definition level.
I wrote a program in C and gave it to gcc. Then I gave the same program to clang and I got a different result.
I guess C code isn't programming.
This is a completely realistic scenario, given variance between compiler output based on optimization level, target architecture, and version.
Sure, LLMs are non-deterministic, but that doesn't matter if you never look at the code.
I can send specific LLM output to QA, I can’t ask QA to validate that this prompt will always produce bug free code even for future versions of the AI.
The output of the LLM is nondeterministic, meaning that the same input to the LLM will result in different output from the LLM.
That has nothing to do with whether the code itself is deterministic. If the LLM produces non-deterministic code, that's a bug, which hopefully will be caught by another sub-agent before production. But there's no reason to assume that programs created by LLMs are non-deterministic just because the LLMs themselves are. After all, humans are non-deterministic.
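To make the distinction concrete, here is a minimal sketch in Ruby. It assumes two hypothetical variants of the same function, as two different LLM runs might plausibly write them (the names `total_a` and `total_b` are invented for illustration). The text of the variants differs, but each one is an ordinary deterministic program, and one acceptance check can be applied to whichever specific output gets shipped.

```ruby
# Two hypothetical "generated" variants of the same function.
# The names and bodies are illustrative assumptions, not real LLM output.
def total_a(prices_in_cents)
  prices_in_cents.sum
end

def total_b(prices_in_cents)
  prices_in_cents.reduce(0) { |acc, price| acc + price }
end

# One acceptance check, applied to a specific variant.
def acceptable?(impl)
  impl.call([250, 499]) == 749 && impl.call([]) == 0
end

puts acceptable?(method(:total_a))
puts acceptable?(method(:total_b))
# Calling the same variant again with the same input gives the same result.
puts total_a([250, 499]) == total_a([250, 499])
```

The check says nothing about what a future run of the model would emit; it only validates the concrete code it was handed.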
> I can send specific LLM output to QA, I can’t ask QA to validate that this prompt will always produce bug free code even for future versions of the AI.
This is a crazy scenario that does not correspond to how anyone uses LLMs.
That we agree it’s nonsense means we agree that using LLM prompts as a high level language is nonsense.
gcc and clang produce different assembly code, but it "does the same thing," for certain definitions of "same" and "thing."
Claude and Gemini produce different Rust code, but it "does the same thing," for certain definitions of "same" and "thing."
The issue is that the ultimate beneficiary of AI is the business owner. He's not a programmer, and he has a much looser definition of "same."
Claude and Gemini do not "do the same thing" in the same way in which Clang and GCC do the same thing with the same code (as long as certain axioms of the code hold).
The C Standard has been rigorously written to uphold certain principles such that the same code (following its axioms) will always produce the same results (under specified conditions) for any standard compliant compiler. There exists no such standard (and no axioms nor conditions to speak of) where the same is true of Claude and Gemini.
If you are interested, you can read the standard here (after purchasing access): https://www.iso.org/obp/ui/#iso:std:iso-iec:9899:ed-5:v1:en
True, but none of that is relevant to the non-programmer end user.
> You are relying on some odd definitions of "definitions", "equivalence", and "procedures"
These terms have rigorous definitions for programmers. The person making software in the future is a non-programmer and doesn't care about any of that. They care only that the LLM can produce what they asked for.
> The C Standard has been rigorously written to uphold certain principles
I know what a standard is. The point is that the standard is irrelevant if you never look at the code.
Your argument here (if I understand you correctly) is the same argument that to build a bridge you do not need to know all the laws of physics that prevent it from collapsing. The project manager of the construction team doesn’t need to know them, and certainly not the bicyclists who cross it. But the engineer who draws the blueprints needs to know them, and it matters that every detail on those blueprints is rigorously defined, such that the project manager of the construction team follows them to the letter. If the engineer does not know the laws of physics, or the project manager does not follow the blueprints to the letter, the bridge will likely collapse, killing the end user, that poor bicyclist.
I don't think I am. If you ask an LLM for a burger web site, you will get a burger web site. That's the only category that matters.
If one burger website generated uses PHP and the other is plain javascript, which completely changes the way the website has to be hosted--this category matters quite a bit, no?
No. Put yourself in the shoes of the owner of the burger restaurant (who has only heard the term "JavaScript" twice in his life and vaguely remembers it's probably something related to "Java", which he has heard three times) and you'll know why the answer is no.
This is like saying it doesn't matter if your pipes are iron, lead or PVC because you don't see them. They all move water and shit where they need to be, so no problem. Ignorance is bliss I guess? Plumbers are obsolete!
It matters to you because you're a programmer, and you can't imagine how someone could create a program without being a programmer. But it doesn't really matter.
The non-technical user of the LLM won't care if the LLM generates PHP or JS code, because they don't care how it gets hosted. They'll tell the LLM to take care of it, and it will. Or more likely, the user won't even know what the word "hosting" means, they'll simply ask the LLM to make a website and publish it, and the LLM takes care of all the details.
Feels like the non-programmer is going to care a little bit about paying for 5 different hosting providers because the LLM decided to generate their burger website in PHP, JavaScript, Python, Ruby and Perl in successive iterations.
It's an implementation detail. The user doesn't care. OpenClaw can buy its own hosting if you ask it to.
> Feels like the non-programmer is going to care a little bit about paying for 5 different hosting providers because the LLM decided to generate their burger website in PHP, JavaScript, Python, Ruby and Perl in successive iterations.
There's this cool new program that the kids are using. It's called Docker. You should check it out.
How do you guarantee that the prompt "make me a burger website" results in a Docker container?
At this point, I think you are intentionally missing the point.
The non-programmer doesn't need to know about Docker, or race conditions, or memory leaks, or virtual functions. The non-programmer says "make me a web site" and the LLM figures it out. It will use an appropriate language and appropriate libraries. If appropriate, it will use Docker, and if not, it won't. If the non-programmer wants to change hosting, he can say so, and the LLM will change the hosting.
The level of abstraction goes up. The details that we've spent our lives thinking about are no longer relevant.
It's really not that complicated.
To maybe get out of this loop: your entire thesis is that nonfunctional requirements don't matter, which is a silly assertion. Anyone who has done any kind of software development work knows that nonfunctional requirements are important, which is why they exist in the first place.
My brother in Christ, please get off your condescending horse. I have written compilers. I know how they work. And also you've apparently never heard of undefined behavior.
The point is that the output is different at the assembly level, but that doesn't matter to the user. Just as the output from one LLM may differ from another's, but the user doesn't care.
Every day, you say. I program every day, and I have never, in my 20 years of programming, written undefined behavior on purpose. I think you may be exaggerating a bit here.
I mean, sure, some leet programmers do dabble in undefined behavior; they may even rely on some compiler bug for some extreme edge case during code golf. Whatever. However, it is not uncommon that when enough programmers start relying on undefined behavior behaving in a certain way, it later becomes a part of the standard and is therefore no longer “undefined behavior”.
Like I said in a different thread, I suspect you may be willfully ignorant about this. I suspect you actually know the difference between:
a) written instructions compiled into machine code for the machine to perform, and,
b) output of a statistical model, that may or may not include written instructions of (a).
There are a million reasons to claim (a) is not like (b), the fact that (a) is (mostly; or rather desirably) deterministic, while (b) is stochastic is only one (albeit a very good) reason.
Well, you sound like an ignorant troll who came here to insult people and start fights. Which also happens a lot on the internet.
Take your abrasive ego somewhere else. HN is not for you.
If I know the system I'm designing and I'm steering, isn't it the same?
We're not punching cards anymore, yet we're still telling the machines what to do.
Regardless, the only thing that matters is to create value.
Functions like:
updatesUsername(string) returns result
...can be turned into generic functional euphemism
takeStringRtnBool(string) returns bool
...same thing. context can be established by the data passed in, external system interactions (updates user values, inventory of widgets)
as workers SWEs are just obfuscating how repetitive their effort is to people who don't know better
the era of pure data-driven systems has arrived. in line with the push to dump OOP we're dumping irrelevant context in the code altogether: https://en.wikipedia.org/wiki/Data-driven_programming
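One way to read the claim above, as a rough sketch in Ruby: a single generic string-to-boolean entry point whose behavior is selected by the data passed in, rather than by a dedicated named function per task. The `HANDLERS` table and both action names are assumptions made up for illustration; real handlers would talk to external systems instead of validating strings.

```ruby
# Hypothetical action table: each entry maps an action name to a
# string -> boolean handler. Entries are illustrative assumptions.
HANDLERS = {
  update_username: ->(value) { !value.strip.empty? },
  update_email:    ->(value) { value.include?("@") }
}.freeze

# Generic entry point: the data (action + value) establishes the context.
def take_string_rtn_bool(action, value)
  handler = HANDLERS[action]
  return false unless handler # unknown actions are rejected

  handler.call(value)
end

puts take_string_rtn_bool(:update_username, "swah")
puts take_string_rtn_bool(:update_email, "not-an-address")
```

Adding a new task means adding a table entry, not a new function signature, which is roughly what the linked data-driven programming article describes.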
Lots of horrifying things are inevitable because they represent "progress" (where "progress" means "good for the market", even if it's bad for the idea of civilization), and we, as a society, come to adapt to them, not because they are good, but because they are.
>"I gave it the same prompt, and it came out different"
1:1 reproducibility is much easier in LLMs than in software building pipelines. It's just not guaranteed by major providers because it makes batching less efficient.
What’s a ‘software building pipeline’ in your view here? I can’t think of parts of the usual SDLC that are less reproducible than LLMs, could you elaborate?
Input-to-output reproducibility in LLMs (assuming the same model snapshot) is a matter of optimizing the inference for it and fixing the seed, which is vastly simpler. Google for example serves their models in an "almost" reproducible way, with the difference between runs most likely attributed to batching.
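As a toy illustration of the seed point, the Ruby sketch below samples "tokens" from a fixed probability table. It is not a real LLM and makes no claim about how any provider serves models; `VOCAB`, `WEIGHTS`, and `sample_tokens` are invented for the example. It only shows that a stochastic sampler becomes repeatable once the random number generator is seeded.

```ruby
# Toy next-token sampler over a fixed distribution (illustrative only).
VOCAB = %w[burger fries shake].freeze
WEIGHTS = [0.5, 0.3, 0.2].freeze

# Draw `length` tokens using the supplied random number generator.
def sample_tokens(rng, length)
  Array.new(length) do
    r = rng.rand # float in [0, 1)
    cumulative = 0.0
    index = WEIGHTS.each_index.find do |i|
      cumulative += WEIGHTS[i]
      r < cumulative
    end
    VOCAB[index || VOCAB.size - 1] # guard against floating-point round-off
  end
end

run_a = sample_tokens(Random.new(42), 8)
run_b = sample_tokens(Random.new(42), 8)
puts run_a == run_b # same seed, same sequence of draws
```

In a real serving stack the remaining sources of variation would be things like batching and floating-point reduction order, which this sketch does not model.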
If you are using an LLM as a high level language, that means that every time you make a slight change to anything and “recompile” all of the thousands upon thousands of unspecified implementation details are free to change.
You could try to ameliorate this by training LLMs to favor making fewer changes, but that would likely end up encoding every bad architecture decision made along the way and essentially forcing a convergence on bad design.
Fixing this I think requires judgment on a level far beyond what LLMs have currently demonstrated.
I can write a spec for an entirely new endpoint, and Claude figures out all of the middleware plumbing and the database queries. (The catch: this is in Rust and the SQL is raw, without an ORM. It just gets it. I'm reviewing the code, too, and it's mostly excellent.)
I can ask Claude to add new data to the return payloads - it does it, and it can figure out the cache invalidation.
These models are blowing my mind. It's like I have an army of juniors I can actually trust.
In my experience, agentic LLMs tend to write code that is very branchy, with high cyclomatic complexity. They don't follow DRY principles unless you push them very hard in that direction (and even then not always), and sometimes they do things that just fly in the face of common sense. Example of that last part: I was writing some Ruby tests with Opus 4.6 yesterday, and I got dozens of tests that amounted to this:
x = X.new
assert x.kind_of?(X)
This is of course an entirely meaningless check. But if you aren't reading the tests and you just run the test job and see hundreds of green check marks and dozens of classes covered, it could give you a false sense of security.
You are missing the forest for the trees. Sure, we can find flaws in the current generation of LLMs. But they'll be fixed. We have a tool that can learn to do anything as well as a human, given sufficient input.
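For contrast with the `kind_of?` test quoted above, here is a small Ruby sketch of a check that can actually fail. `Cart` is a hypothetical stand-in for the `X` in that snippet, and plain `raise` is used instead of a test framework to keep the example self-contained.

```ruby
# Hypothetical class standing in for X from the quoted test.
class Cart
  def initialize
    @items = []
  end

  # Record an item and return self so calls can be chained.
  def add(name, cents)
    @items << [name, cents]
    self
  end

  def total_cents
    @items.sum { |_name, cents| cents }
  end
end

cart = Cart.new
# Tautological: passes for any class that can be instantiated at all.
raise "unexpected type" unless cart.kind_of?(Cart)

# Behavioral: fails if add or total_cents is broken.
cart.add("burger", 899).add("fries", 349)
raise "wrong total" unless cart.total_cents == 1248
puts "ok"
```

Both checks show up as green in a test report, but only the second one says anything about whether the class works.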
where's the catch? SQL is an old technology, surely an LLM is good with it