Posted by leerob 10/29/2025
SWE-grep[0] was able to hit ~700 tokens/s and Cursor ~300 tokens/s. It's hard to compare precision/recall and cost-effectiveness, though, considering SWE-grep also adopted the "hack" of running on Cerebras.
I'm trying to kickstart an RL-based code search project called "op-grep" here[1]. It's still pretty early, but I'm looking for collaborators!
[0]: https://cognition.ai/blog/swe-grep
[1]: https://github.com/aperoc/op-grep
I’m assuming major release vs stable, but this is pretty lackluster so far. Switched back to Sonnet reasoning. Here’s to improving!
Do you have to split the plan into parallelizable tasks that can be worked on in one codebase without breaking or confusing the other agents?
I think competition in the space is a good thing, but I'm very skeptical their model will outperform Claude.
[1] https://www.businessinsider.com/no-shoes-policy-in-office-cu...
Still not up to Cursor standards though :)
Cursor's tab completion is better, but it doesn't seem to have a concept of not trying to tab complete. IntelliJ is correct half the time for completing the rest of the line and only suggests when it is somewhat confident in its answer.
It made migrating as simple as install and import settings for everyone using VSCode (probably the single most popular editor) or another VSCode-forked editor (though at the time it was basically all VSCode).
I do not think Cursor would have done nearly as well as it has if it didn't. So even though it can be subpar in some areas due to VSCode's baggage, it's probably staying that way for a while.
Maybe my complaint is that I wish VSCode had more features like IntelliJ, or that IntelliJ was the open source baseline a lot of other things could be built on.
IntelliJ is not without its cruft and problems, don't get me wrong. But its git integration, search, navigation, database tools (I could go on) are all just so much nicer than what VSCode offers.
Looking at the graph, it would appear there's an implicit "today" in that statement, as they do appear poised to equal or surpass Sonnet 4.5 on that same benchmark in the near future.