Posted by swyx 12/19/2025

LLM Year in Review (karpathy.bearblog.dev)
384 points | 146 comments | page 3
dandelionv1bes 12/20/2025|
Something I’ve been thinking about is how, as end-stage users (e.g., building our own “thing” on top of an LLM), we can broadly verify it’s doing what we need without benchmarks. Does a set of custom evals built out over time solve this? Is there more we can do?
swyx 12/19/2025||
xposted to https://x.com/karpathy/status/2002118205729562949
CamperBob2 12/20/2025|
And also accessible sans login via https://xcancel.com/karpathy/status/2002118205729562949 .
distalx 12/20/2025||
Friendly reminder: There is no ghost in the machine. It is a system executing code, not a being having thoughts. Let’s admire the tool without projecting a personality onto it.
skybrian 12/20/2025||
For me, that’s kind of the point. It’s similar to how the characters in a novel don’t really exist, and yet you can’t really discuss what happens in a novel without pretending that they do. It doesn’t really make sense to treat the author’s motivations and each character’s motivations as the same.

Similarly, we’re all talking to ghosts now, which aren’t real, and yet there is something there that we can talk about. There are obvious behavioral differences depending on what persona the LLM is generating text for.

I also like the hint of danger in “talking to ghosts.” It’s difficult to see how a rational adult could be in any danger from just talking, but I believe the news reports that some people who get too deep into it get “possessed.”

ngruhn 12/20/2025|||
Consciousness is weird and nobody understands it. There is no good reason to assume that these systems have it. But there is also no good reason to rule it out.
dr_dshiv 12/20/2025|||
That’s the old way of thinking about it. There is a new way.
squidbeak 12/20/2025||
You sound as if you have grounds for certainty about this. What are they?
metalman 12/20/2025||
find on page:slop=0
ausbah 12/20/2025|
tl;dr seems like LLMs are maturing on the product side and for day-to-day usage