Posted by bilsbie 6/30/2025
It can probably remember more facts about a topic than a PhD in that topic, but the PhD will be better at thinking about that topic.
"Thinking" is too broad a term to apply usefully but I would say its pretty clear we are not close to AGI.
Why should the model need to memorize facts we already have written down somewhere?
So can a notebook.
What about simulation: models can make 3D objects, so why not give them a physics simulator? We have amazing high-fidelity (and low-cost!) game engines that would be a great building block.
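As a rough sketch of what that could look like, here's a toy "simulate" tool a model could call, using pybullet as the engine; the tool shape and numbers are purely illustrative, not anyone's actual setup:

    # Sketch: a physics engine exposed to the model as a callable tool.
    # pybullet is the engine here; the tool itself is hypothetical.
    import pybullet as p

    def simulate_drop(height_m: float, steps: int = 240) -> dict:
        """Drop a unit-mass sphere from height_m and report where it settles."""
        p.connect(p.DIRECT)                       # headless, no GUI
        p.setGravity(0, 0, -9.81)
        sphere = p.createMultiBody(
            baseMass=1.0,
            baseCollisionShapeIndex=p.createCollisionShape(p.GEOM_SPHERE, radius=0.1),
            basePosition=[0, 0, height_m],
        )
        for _ in range(steps):                    # default timestep is 1/240 s
            p.stepSimulation()
        pos, _ = p.getBasePositionAndOrientation(sphere)
        p.disconnect()
        return {"final_position": pos}

    # Instead of guessing, the model calls the tool and reads off the answer:
    print(simulate_drop(height_m=10.0))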
What about rumination: behind every Cursor rule, for example, is a whole story of why a user added it. Why not take the rule, ask a reasoning model to hypothesize about why it was created, and add that rumination (along with the rule) to the training data? Giving models the chance to reflect on the choices their users made might deepen any insights, squeezing more juice out of the data.
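Something like this seems cheap to prototype. A minimal sketch, where the prompt wording, model name, and helper are placeholders rather than a tested recipe:

    # Sketch: generate a "rumination" for a Cursor rule and pair it with the rule.
    # Prompt wording and model choice are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def ruminate(rule: str) -> dict:
        prompt = (
            "Here is a rule a developer added to their Cursor config:\n\n"
            f"{rule}\n\n"
            "Hypothesize, step by step, what problem or repeated mistake "
            "likely motivated this rule."
        )
        resp = client.chat.completions.create(
            model="o3-mini",  # any reasoning model would do
            messages=[{"role": "user", "content": prompt}],
        )
        return {"rule": rule, "rumination": resp.choices[0].message.content}

    pair = ruminate("Always use parameterized queries; never build SQL with f-strings.")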
We already let models write code and run it, which gives them a high chance of getting arithmetic right.
Solving the "crossing the river" problem by letting the model create and run a simulation would give it a pretty high chance of getting that right too.
https://docs.anthropic.com/en/docs/agents-and-tools/tool-use...
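To make the river-crossing example concrete: the classic wolf/goat/cabbage version reduces to a tiny state-space search the model could write and execute rather than reason through in its head. A rough sketch of the kind of code it might produce:

    # Sketch: brute-force search over states of the wolf/goat/cabbage puzzle.
    from collections import deque

    ITEMS = {"wolf", "goat", "cabbage"}
    UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]   # pairs that can't be left alone

    def safe(bank):
        return not any(pair <= bank for pair in UNSAFE)

    def solve():
        # State: (items on the left bank, farmer's side); everything starts on the left.
        start = (frozenset(ITEMS), "left")
        queue, seen = deque([(start, [])]), {start}
        while queue:
            (left, side), path = queue.popleft()
            if not left and side == "right":
                return path                          # everything has crossed
            here = left if side == "left" else ITEMS - left
            # The farmer crosses alone or with one item from the current bank.
            for cargo in [frozenset()] + [frozenset([x]) for x in here]:
                new_left = left - cargo if side == "left" else left | cargo
                new_side = "right" if side == "left" else "left"
                unattended = new_left if new_side == "right" else ITEMS - new_left
                state = (new_left, new_side)
                if safe(unattended) and state not in seen:
                    seen.add(state)
                    queue.append((state, path + [(sorted(cargo), new_side)]))

    print(solve())   # prints a 7-crossing plan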
Each Cursor rule is a byproduct of tons of work and probably contains lots that can be unpacked. Any research on that?
This is easier said than done, though, because this value function is so noisy it's often hard to learn from. And whether a response (the model output) actually matches the value function (the Cursor rules) is not easy to grade either. It's been easier to train chain-of-thought-style reasoning, since one can directly score it via the length of thinking.
This new paper covers some of the difficulties of language-based critic models: https://openreview.net/pdf?id=0tXmtd0vZG
Generally speaking, the algorithm and approach are not new. Being able to do it with a reasonable amount of compute is the new part.
Do that for a bunch of rules scraped from a bunch of repos and you've got yourself a dataset for training a new model, or maybe for fine-tuning.
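Continuing the sketch from above (reusing the hypothetical ruminate() helper; the JSONL chat layout is just the common fine-tuning format, not any particular vendor's spec):

    # Sketch: run rumination over many scraped rules and dump a fine-tuning dataset.
    import json

    def build_dataset(rules, out_path="cursor_ruminations.jsonl"):
        with open(out_path, "w") as f:
            for rule in rules:
                pair = ruminate(rule)               # hypothetical helper from above
                record = {
                    "messages": [
                        {"role": "user",
                         "content": f"Why might a team adopt this rule?\n{pair['rule']}"},
                        {"role": "assistant", "content": pair["rumination"]},
                    ]
                }
                f.write(json.dumps(record) + "\n")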
The original idea of connectionism is that neural networks can represent any function, which is a fundamental mathematical fact. So we should be optimistic: neural nets will be able to do anything. But which neural nets? So far people have stumbled on a few productive architectures, and it looks more like alchemy than science. There is no reason to think there won't be both new ideas and new data. Biology did it; humans will do it too.
> we’re engaged in a decentralized globalized exercise of Science, where findings are shared openly
Maybe the findings are shared, if they make the Company look good. But the methods are not anymore.
That's not super relevant in my mind. What matters is that they're bearing fruit now, which will allow research to move forward. And success, as we know, draws a lot of eyeballs, dollars, and resources.
If this path is going to hit a wall, we'll hit it more quickly now. If another way forward is required, we're more likely to find it now.
Just a hypothesis of mine.