Good (if superficial) post in general, but on this point specifically, emphatically: no, they do not -- no shade, nobody does, at least not in any meaningful sense.
There is a lot left to learn about the behaviour of LLMs, and higher-level conceptual models still to be formed that help us predict specific outcomes and design better systems, but this meme that "nobody knows how LLMs work" is out of control.
LLMs are understood to the extent that they can be built from the ground up. Literally every single aspect of their operation is understood so thoroughly that we can capture it in code.
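To make that concrete, here is a deliberately tiny sketch of the core loop: one causal self-attention block plus autoregressive sampling. Everything here (the sizes, the names, the random weights) is made up for illustration; a real LLM is this same fully specified arithmetic with billions of trained weights and many stacked layers.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab, d = 50, 16                       # toy vocabulary size / model width
    E = rng.normal(size=(vocab, d))         # token embedding table
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    Wout = rng.normal(size=(d, vocab))      # projection back to next-token logits

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def next_token_logits(tokens):
        # One causal self-attention pass over the whole context.
        x = E[tokens]                              # (T, d)
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(d)              # (T, T) attention scores
        mask = np.triu(np.ones_like(scores), k=1)  # causal mask: no peeking ahead
        h = softmax(np.where(mask == 1, -1e9, scores)) @ v
        return h[-1] @ Wout                        # logits from the last position

    def generate(prompt, n=10, temperature=1.0):
        tokens = list(prompt)
        for _ in range(n):
            p = softmax(next_token_logits(tokens) / temperature)
            tokens.append(int(rng.choice(vocab, p=p)))  # sampling: output is stochastic
        return tokens

    print(generate([1, 2, 3]))

Nothing in that loop is mysterious; the open questions are about why the trained weights produce the high-level behaviours they do, not about what the machine is computing.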
If you achieved an understanding of how the human brain works at that level of detail, completeness and certainty, a Nobel prize wouldn't be anywhere near enough. They'd have to invent some sort of Giganobel prize and erect a giant golden statue of you in every neuroscience department in the world.
But if you feel happier treating LLMs as fairy magic, I've better things to do than argue.
I don't have an inherent understanding of English, although I use it regularly.
Treating LLMs as fairy magic doesn't make me feel any happier, for whatever it's worth. But I'm not interested in arguing either.
I never intended to make any claims about how well the principles of LLMs can be understood. Just that none of that understanding is inherent. I don't know why they used that word, as it seems to weaken the post.
This is likely (certainly?) impossible. So not a useful definition.
Meanwhile, I have observed a very clear binary among people I know who use LLMs: those who treat them like a magic AI oracle, versus those who understand the autoregressive model, the need for context engineering, the fact that outputs are somewhat random (hallucinations exist), setting the temperature correctly...
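On that last point, "setting the temperature correctly" is nothing mystical: it is a one-line rescaling of the logits before sampling. A minimal sketch with made-up logits:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.2])   # hypothetical next-token scores
    for t in (0.2, 1.0, 2.0):
        print(t, np.round(softmax(logits / t), 3))
    # Low t sharpens the distribution toward the argmax (near-deterministic);
    # high t flattens it toward uniform (more "random" output).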
"we" are not, what i quoted and replied-to did! i'm not inventing strawmen to yell at, i'm responding to claims by others!
Running LLMs is expensive, and we can swap models easily. The fight for attention is on, and it acts like an evolutionary pressure on LLMs. We have already seen the sycophancy trend as a result of it.
Rather: use your time to learn serious, deep knowledge instead of wasting it reading (and especially spreading) the science-fiction stories the AI bros tell all the time. These AI bros are insanely biased, since they stand to lose a lot of money if these stories turn out to be false, or even if people simply stop believing in these science-fiction fairy tales.
Has anyone experimented with deliberately structuring prompts to take advantage of these memory patterns?
Yes, I too imagine these "more technical users" spamming rocket-ship and confetti emojis, absolutely _celebrating_ the most toxic code contributions imaginable to some of the most important software in the world. Claude is (by default) the exact kind of engineer you don't want in your company. Whatever little reinforcement-learning system/simulation they used to fine-tune their model is a mockery of what real software engineering is.