
Posted by yakkomajuri 9 hours ago

The last six months in LLMs in five minutes (simonwillison.net)
440 points | 305 comments | page 4
hansmayer 1 hour ago|
TL;DR:

"Coding agents got really good - here, a bunch of non-relevant slop-pictures of pelicans riding bikes as a key benchmark AND a couple of hardly relevant edge-case demo-projects of mine to prove it right!"

Come on man, where is the AI writing all the code in 6 months? We're close to June and Amodei's latest statement from January does not look like it will be fulfilled over the next few weeks, does it now?

tayo42 6 hours ago||
The claw thing really came and went fast lol
yieldcrv 5 hours ago||
I just started a new job and the person I report to was just excited to tell me about it, here in mid-May

"and then you have to get a mac mini, and then, and then"

smile and nod, it pays weekly

viking123 2 hours ago||
I mean yeah? It was a marketing campaign to boost the model providers and give Steinberger a cozy job at OpenAI. Hook, line and sinker.

Wake me up when we have an agent with constant learning and changing weights that I can have personally, not some LLM that can always fall prey to jailbreaks and context injection attacks.

You think most of this stuff here is organic? Oh boy..

DeathArrow 6 hours ago||
I think that there's a lot to be improved in harnesses and the way the models are interacting with harnesses. For example, the harness should be able to steer the model when thinking.
aizk 7 hours ago||
I'm so glad Simon is documenting this. The field is evolving so fast, so rapidly, so hungry for data and money, that few are willing to zoom out and document everything big picture so we can see the changes over time. I mean do you guys remember "Do anything now"? Just a distant memory, a funny party trick.
rhubarbtree 3 hours ago|
Certainly a massive AI booster. What are the conflicts of interest?