Posted by lsdmtme 15 hours ago
I point it at example snippets and web documentation, but the code it generates won't work at all, not even close.
Opus 4.6 is a tiny bit less wrong than Codex 5.4 xhigh, but still pretty useless.
So, after reading all the success stories here and everywhere, I'm wondering if I'm holding it wrong or if it just can't solve everything yet.
Xhigh can also perform worse than High - more frequent compaction, and "overthinking".
Such as:
Adding fine curl noise to a volumetric smoke shader
Fixing an issue with entity interpolation in an entity/snapshot netcode
Finding some rendering bugs related to lightmaps not loading in particular cases (and it had actually introduced that bug itself).
Just basic stuff.
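For context on the interpolation item above: entity interpolation in snapshot-based netcode usually means rendering entities slightly in the past and blending between the two buffered snapshots that bracket the render time. A minimal sketch, with hypothetical names and 1D positions for simplicity (real code would interpolate vectors and rotations):

```python
def interpolate_position(snapshots, render_time):
    """snapshots: list of (timestamp, position) pairs sorted by time.
    render_time: client clock minus a small interpolation delay,
    so it normally falls between two received snapshots.
    Returns a linearly interpolated position, clamped at the ends."""
    if render_time <= snapshots[0][0]:
        return snapshots[0][1]
    for (t0, p0), (t1, p1) in zip(snapshots, snapshots[1:]):
        if t0 <= render_time <= t1:
            a = (render_time - t0) / (t1 - t0)  # blend factor in [0, 1]
            return p0 + a * (p1 - p0)
    # Past the newest snapshot: hold the last known position
    # (a real client might extrapolate instead).
    return snapshots[-1][1]
```

The classic interpolation bugs live in the edges this sketch glosses over: render time drifting outside the buffer, duplicate timestamps, and dropped snapshots.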
That sort of GPU code has a lot of concepts and machinery; it's not just syntax to express, and everything has to be just right or you get a blank screen. I also use them differently than most examples: I use them for data viz (turning data into meshes), while most samples are about level of detail. So it's a double whammy.
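The curl-noise item mentioned earlier has a simple core idea behind all that shader machinery: take the curl of a scalar noise potential, so the resulting velocity field is divergence-free, which is what gives smoke its swirly, non-compressing detail. A rough 2D sketch in Python rather than shader code; the `potential` function here is an illustrative stand-in for real Perlin/simplex noise:

```python
import math

def potential(x, y):
    # Stand-in scalar noise field (a real shader would sample
    # Perlin/simplex noise octaves); a few incommensurate sines
    # just to have something smooth and wiggly.
    return (math.sin(1.7 * x + 0.3) * math.cos(2.3 * y)
            + 0.5 * math.sin(3.1 * x * y))

def curl_noise(x, y, eps=1e-4):
    # 2D curl of a scalar potential: v = (d(psi)/dy, -d(psi)/dx),
    # estimated with central finite differences. The resulting
    # field is divergence-free by construction.
    dpsi_dx = (potential(x + eps, y) - potential(x - eps, y)) / (2 * eps)
    dpsi_dy = (potential(x, y + eps) - potential(x, y - eps)) / (2 * eps)
    return dpsi_dy, -dpsi_dx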
But once I pointed either LLM at my own previous work (the code from months of prior personal exploration and battles for understanding), they both worked much better. Not great, but we could make progress.
I also needed to make more mini-harnesses / scaffolds for it to work through; in other words, isolating its focus, kind of like test-driven development.
That's also how you can get the LLM to do stuff outside of the training data in a reasonably good way, by not just including the _what_ in the prompt, but also the _how_.
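A mini-harness in that spirit can be as small as one function under development plus a table of input/output cases: the model only has to make the cases pass, instead of reasoning about the whole codebase. A hypothetical example (the `normalize_angle` task and all names here are invented for illustration):

```python
import math

def normalize_angle(theta):
    # Function under development (e.g. the part handed to the LLM):
    # wrap any angle into [-pi, pi).
    return (theta + math.pi) % (2 * math.pi) - math.pi

# The "spec" the model has to satisfy: (input, expected output) pairs.
CASES = [
    (0.0, 0.0),
    (3.5, 3.5 - 2 * math.pi),
    (-4.0, -4.0 + 2 * math.pi),
]

def run_harness():
    # Tiny TDD-style loop: fail loudly with the offending case.
    for given, expected in CASES:
        got = normalize_angle(given)
        assert abs(got - expected) < 1e-9, (given, got, expected)
    return "all cases pass"
```

The cases encode the _how_ (wrapping convention, boundary behavior) that a bare prompt like "normalize the angle" leaves ambiguous.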
(Don’t get mad at me, I’m a webshit developer)
Obviously it cannot. But if you give the AI enough hints, a clear spec, and clear documentation, and remove all distracting information, it can solve most problems.
What you're doing is more specialized and these models are useless there. It's not intelligence.
Another NFT/crypto era is upon us, so no, you're not holding it wrong.
One of these is better.
https://www.newyorker.com/magazine/2026/04/13/sam-altman-may...
You do know you can just google his name yourself, don't you?
</tinfoil>
Meanwhile their 'best' competitor just announced they want to provide unreliable mass-destruction guidance tools, but they don't wanna feel sad.
Honestly speaking, we are wrong whenever we do business with this sort of people.
FWIW that's what most TOSes say for the majority of online services. Some even include arbitration clauses to prevent civil suits and class-action cases.
[0] https://www.dwarkesh.com/i/187852154/004620-if-agi-is-immine...
This tends to happen during the pretraining phase of new models.
Happened with 3.x too.
All the news I've heard about this company over the past few weeks makes it sound like they're really desperate.
But more likely they are constrained on GPUs and can't get them fast enough.
(My guess having no understanding of how this industry actually works.)
I canceled my subscription and switched to Codex, but it's not as good. I'm tired of Anthropic changing things all the time. I use Claude because it doesn't redirect you to a different model like OpenAI does. But now it seems like both companies are doing the same thing in different ways.