2025: The Year in LLMs

Posted by simonw 12/31/2025

2025: The Year in LLMs (simonwillison.net)

940 points | 599 comments | page 6

nasnsjdkd 1/1/2026|

[flagged]

jennyholzer3 1/2/2026||

[flagged]

anonnon 1/1/2026||

[flagged]

dang 1/1/2026||

He's one of the most valuable writers on LLMs, which are one of the major topics at present. That's not spam.

anonnon 1/1/2026|||

> He's one of the most valuable writers on LLMs

Is he, really? Most of his blog posts are little more than opportunistic, buttressing commentary on someone else's blog post or article, often with a bit of AI apologia sprinkled in (for example, marginalizing people as paranoid for not taking AI companies at their word that they aren't aggressively scraping websites in violation of robots.txt, or exfiltrating user data in AI-enabled apps).

EDIT: and why must he link to his blog so often in his comments? How is that not SEO/engagement farming? BTW dang, I wasn't insinuating the mods were in league with him or anything, just that, IMO, he's long past the point at which good faith should no longer be assumed.

dang 1/1/2026|||

Please stop.

th0ma5 1/1/2026||

I think when a moderator keeps intervening like this it really does mean that there's something wrong here. I think people would be less mad if you just went ahead and said that you have some kind of special arrangement here with this influencer and post publicly that you like them constantly spamming the site and letting their fans flood the place with deflection and appeals for donations to them. Even YouTube had to add a sponsored post disclaimer.

dang 1/1/2026|||

There's no special arrangement. The only issue is clarifying what content is welcome vs. unwelcome on HN. simonw's content is obviously welcome, and this ought to be obvious.

> I think people would be less mad

People aren't mad about this. The vast majority of this community values simonw's contributions, which are well within the sweet spot for material on HN. That's why his material gets upvoted, as minimaxir (no friend of astroturfers) has pointed out elsewhere in this thread: https://news.ycombinator.com/item?id=46451969.

jheez3 1/2/2026|||

Agreed. The preferential treatment is blatant and obvious.

The fact that an old-timer like yourself comes forward and says it means that the newer people aren't nutters thinking it.

simonw 1/1/2026|||

If you're not assuming good faith what are you assuming here? What's my motivation?

"buttressing commentary on someone else's blog post"

That's how link blogs work. I wrote more about my approach to that here: https://simonwillison.net/2024/Dec/22/link-blog/

(And yes, there I go again linking to something I've written from a comment. It's entirely relevant to the point I am making here. That's why I have a blog - so I can put useful information in one place.)

I'll also note that I don't ever share links to my link blog posts on Hacker News myself - I don't think they're the right format for a HN post. I can't help if other people share them here: https://news.ycombinator.com/from?site=simonwillison.net

anonnon 1/1/2026||

> What's my motivation?

Are you really going to insult my and others' intelligence like this? Directly or indirectly, your motivation is money. You already offer monthly subscriptions to your blog, and you're clearly trying to build a monetizable brand for yourself as a leading authority on AI, especially as it pertains to software development.

simonw 1/1/2026|||

If my motivation was money I would cash in on the reputation I've already built and go and land a Silicon Valley salaried job somewhere.

Sponsorship from my monthly newsletter doesn't come close.

Seriously, do you have any idea how much money I'm leaving on the table right now NOT having a real job in this space?

Being a blogger is wildly financially irresponsible!

jheez3 1/2/2026|||

Lol this argument doesn't hold weight.

Do you really want to be an employee? Let's see what your reservation price is first.

I'm pretty sure you'd need to be paid a lot to forgo having control over your time and so on. Let's keep it one-hunnid.

simonw 1/2/2026||

That's why I haven't got a job: I enjoy the freedom of working for myself, and I have enough of a financial runway from a previous startup acquisition that I can afford to do my own thing for a few more years.

At some point I'm going to need to get back to earning more than I spend.

rvz 1/1/2026|||

It is promotional spam.

But given the volume of LLM slop, it was kind of obvious and known that even the moderators now have "favourites" over guidelines.

> Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity. [0]

The blog itself is clearly used as promotion all the time when the original source(s) are buried deep in the post and almost all of the links link back to his own posts.

This is a first on HN and a new low for the moderators, who, as admitted, have regular promotional favourites at the top of HN.

[0] https://news.ycombinator.com/newsguidelines.html

minimaxir 1/1/2026||

The operative word there is "primarily". Simon comments on a variety of topics and has far more interactions that don't link to his blog than do.

Simon's posts are not engagement farming by any definition of the term. He posts good content frequently which is then upvoted by the Hacker News community, which should be the ideal for a Hacker News contributor.

rvz 1/1/2026||

Except that the "content" that reaches the top is always about AI / LLMs and nothing else and it is "all the time". Any opportunity to comment, he will link back to his own blog.

He even reposted the same link (which is about AI) with one of his posts when the upvotes fell off and until the second one reached the top, with the intention of promoting his own blog.

Let me simply prove my point to you on how predictable this spam is.

He will do a blog post this month about this paper [0] with an expert analysis by either someone else (or even an LLM) with the primary intention of the blog being used for self promotion with at least one link back to his own blog.

> ...which is then upvoted by the Hacker News community

You don't know that. But what we do know is that even the moderators now have "favourites". Anyone else would be shot down for promotional spam.

[0] https://arxiv.org/abs/2512.24880

simonw 1/1/2026||

"He even reposted the same link (which is about AI) with one of his posts when the upvotes fell off"

Where did I do that?

> He will do a blog post this month about this paper [0]

That paper you linked to is a perfect example of where my approach can add value!

Did you read it? Do you understand what it is saying? It is dense.

I would love to read an evaluation of that paper by someone who can rephrase the core ideas and conversations into a couple of paragraphs that help me understand it, and help me figure out if I should invest further effort in learning more.

I have a whole tag on my blog for that kind of content called paper-review: https://simonwillison.net/tags/paper-review/ - it's my version of the TikTok meme "I read X so you don't have to".

Honestly, your problem doesn't seem to be with me so much as it seems to be with the concept of blogging in general.

th0ma5 1/1/2026||

[flagged]

dang 1/1/2026|||

You've posted over 40 replies hounding this one user whom you seem to be fixated on. We've already asked you to stop (https://news.ycombinator.com/item?id=44726957) but you've continued:

https://news.ycombinator.com/item?id=46409736

https://news.ycombinator.com/item?id=46395646

https://news.ycombinator.com/item?id=46209386

This is obviously an abuse of HN, regardless of who you're being aggressive towards. We ban accounts that keep doing this. If you keep doing it, we will ban you, so no more of this please.

simonw 1/1/2026||||

I had to paste that into a separate browser window (jwz blocks Hacker News referral traffic) and I cannot figure out how that story is relevant to this conversation. Did you share the right link?

simonw 1/1/2026|||

Probably because my content gets a lot more upvotes than it does flags.

If this post was by anyone other than me would you have any problems with its quality?

firexcy 1/1/2026||

I appreciate his work for being more informative and organized than average AI-related content. Without his blogging, it would be a struggle to navigate the bombastic and narcissistic Twitter/Reddit posts for AI updates. The barrier to entry for AI reporting is so low that you just need to give a bit more care to be distinguished, and he is getting the deserved attention for doing exactly that in a systematic and disciplined manner. (I do believe many on HN are more than capable but not interested in doing the same.) Personally, I sometimes find his posts more congratulatory or trivial than I like, but I have learned to take what I want and ignore what I don’t.

castwide 1/1/2026||

[flagged]

techpression 1/1/2026||

Nothing about the severe impact on the environment, and the hand waviness about water usage hurt to read. The referenced post was missing every single point about the issue by making it global instead of local. And as if data center buildouts are properly planned and dimensioned for existing infrastructure…

Add to this that all the hardware is already old and the amount of waste we’re producing right now is mind boggling, and for what, fun tools for the use of one?

I don’t live in the US, but the amount of tax money being siphoned to a few tech bros should have heads rolling and I really don’t want to see it happening in Europe.

But I guess we got a new version number on a few models and some blown up benchmarks so that’s good, oh and of course the svg images we will never use for anything.

simonw 1/1/2026|

"Nothing about the severe impact on the environment"

I literally said:

"AI data centers continue to burn vast amounts of energy and the arms race to build them continues to accelerate in a way that feels unsustainable."

AND I linked to my coverage from last year, which is still true today (hence why I felt no need to update it): https://simonwillison.net/2024/Dec/31/llms-in-2024/#the-envi...

jennyholzer3 1/2/2026||

Do you think anything should be done about this environmental impact?

Or should we just keep chugging along as though there is no problem at all?

simonw 1/2/2026||

I think we should continue to find ways to serve this stuff more efficiently - already a big focus of the AI labs because they like making money, and reduced energy bills = more profitable inference.

I also think we should use tax policy to provide financial incentives to reduce the environmental impact - tax breaks for renewables, tax hikes for fossil fuel powered data centers, that kind of thing.

asgR1t 1/1/2026||

Most LLMs got worse in 2025. Only addicts and the type of computer gamer that feels drawn to complex setups and gamification, and does not care about the end result, will feel positive about the grift.

2025: The Year in Open Source? Nothing, all resources were tied up to debunk a couple of Python web developers who pose as the ultimate experts in LLMs.

simonw 1/1/2026|

In what way did they get worse?

I made you a dashboard of my 2025 writing about open-source that didn't include AI: https://simonwillison.net/dashboard/posts-with-tags-in-a-yea...

yupyupyups 1/1/2026||

Let's talk about the societal cost these models have had on us including their high energy cost and the proliferation of auto-generated slop media used to milk ad revenue, scam people, SEO farm, do propaganda or automate trolling. What about these big corporations collecting an astronomical amount of debt to hoard DRAM and NAND in a way that has crippled the PC market within weeks? And what are they going to do next, put a few dollars in Trump's pocket so that they can rob/loot the US population through bailouts? Who gets to keep all the hardware I wonder?

Nvidia, Samsung, SK Hynix and some other vultures I forgot to mention are making serious bank right now.

jennyholzer3 1/2/2026|

> Who gets to keep all the hardware I wonder?

Keep questions like this off of the propaganda thread.

jama211 1/1/2026|

The difference in model performance between 2024 and 2025 has been stark, and that graph really shows it. There are still many people on these forums who seem to think AIs produce terrible code unless ultra supervised, and I can’t help but suspect some of them tried it a little while ago and just don’t understand how different it is now compared to even quite recently.

Madmallard 1/1/2026|

I used Gemini Pro and Claude Pro a couple of dozen times yesterday, and have been using them basically daily.

I have a project to convert my multiplayer XNA game from C# to Javascript and to add networking to the game-play using LLMs.

They are far worse at it now than they were a year ago. They actually implemented the requirements (though inaccurately) to the best of their ability a year ago. Especially Gemini.

Now they don't even come remotely close to implementing just the basic requirements.

The thing is, I'm giving them the entirety of the C# source code and spelling out what they should do.

simonw 1/1/2026|||

Weird. I would expect Gemini 3 Pro and Claude Opus 4.5 to run rings around Gemini 1.5 Pro and Claude Sonnet 3.5.

How are you running them - regular chat interface or do you have them set up with Claude Code or Gemini CLI?

Madmallard 1/1/2026||

Using the chat interface primarily with various prompting strategies.

I am considering making a thread where I compel others to attempt to get what I'm trying to get out of it and show me their work.

The game is only around 25000-30000 LOC in C#.

simonw 1/1/2026||

I'd be happy to join such a thread.

jennyholzer3 1/2/2026|||

"They are far worse at it now than they were a year ago."

This is the part they REALLY don't want you to say.

They can no longer train these models effectively and their performance is slipping. Late 2023 was the golden age.