Posted by simonw 1 day ago

2025: The Year in LLMs(simonwillison.net)
851 points | 503 comments | page 6
castwide 1 day ago|
[flagged]
techpression 1 day ago||
Nothing about the severe impact on the environment, and the hand-waviness about water usage hurt to read. The referenced post missed every single point about the issue by making it global instead of local. And as if data center buildouts are properly planned and dimensioned for existing infrastructure…

Add to this that all the hardware is already old and the amount of waste we’re producing right now is mind-boggling, and for what, fun tools for the use of one?

I don’t live in the US, but the amount of tax money being siphoned to a few tech bros should have heads rolling and I really don’t want to see it happening in Europe.

But I guess we got a new version number on a few models and some blown up benchmarks so that’s good, oh and of course the svg images we will never use for anything.

simonw 1 day ago|
"Nothing about the severe impact on the environment"

I literally said:

"AI data centers continue to burn vast amounts of energy and the arms race to build them continues to accelerate in a way that feels unsustainable."

AND I linked to my coverage from last year, which is still true today (hence why I felt no need to update it): https://simonwillison.net/2024/Dec/31/llms-in-2024/#the-envi...

asgR1t 12 hours ago||
Most LLMs got worse in 2025. Only addicts and the type of computer gamer who feels drawn to complex setups and gamification and does not care about the end result will feel positive about the grift.

2025: The Year in Open Source? Nothing, all resources were tied up to debunk a couple of Python web developers who pose as the ultimate experts in LLMs.

simonw 12 hours ago|
In what way did they get worse?

I made you a dashboard of my 2025 writing about open-source that didn't include AI: https://simonwillison.net/dashboard/posts-with-tags-in-a-yea...

yupyupyups 20 hours ago||
Let's talk about the societal cost these models have had on us including their high energy cost and the proliferation of auto-generated slop media used to milk ad revenue, scam people, SEO farm, do propaganda or automate trolling. What about these big corporations collecting an astronomical amount of debt to hoard DRAM and NAND in a way that has crippled the PC market within weeks? And what are they going to do next, put a few dollars in Trump's pocket so that they can rob/loot the US population through bailouts? Who gets to keep all the hardware I wonder?

Nvidia, Samsung, SK Hynix and some other vultures I forgot to mention are making serious bank right now.

jama211 20 hours ago|
The difference in model performance between 2024 and 2025 has been so stark, and that graph really shows it. There are still many people on these forums who seem to think AIs produce terrible code unless ultra supervised, and I can’t help but suspect some of them tried it a little while ago and just don’t understand how different it is now compared to even quite recently.
Madmallard 9 hours ago|
I used Gemini Pro and Claude Pro a couple of dozen times yesterday, and have been using them basically daily.

I have a project to convert my multiplayer XNA game from C# to JavaScript and to add networking to the gameplay using LLMs.

They are far worse at it now than they were a year ago. A year ago they actually implemented the requirements (though inaccurately) to the best of their ability. Especially Gemini.

Now they don't even come remotely close to implementing just the basic requirements.

The thing is, I'm giving them the entirety of the C# source code and spelling out what they should do.

simonw 9 hours ago||
Weird. I would expect Gemini 3 Pro and Claude Opus 4.5 to run rings around Gemini 1.5 Pro and Claude Sonnet 3.5.

How are you running them: regular chat interface, or do you have them set up with Claude Code or Gemini CLI?

Madmallard 8 hours ago||
Using the chat interface primarily with various prompting strategies.

I am considering making a thread where I compel others to attempt to get what I'm trying to get out of it and show me their work.

The game is only around 25,000-30,000 LOC in C#.

simonw 5 hours ago||
I'd be happy to join such a thread.