Posted by leerob 10/29/2025
> their own internal benchmark that they won't release
If they'd release their internal benchmark suite, it'd make it into the training set of about every LLM, which from a strictly scientific standpoint, invalidates all conclusions drawn from that benchmark from then on. On the other hand, not releasing the benchmark means they could've hand-picked the datapoints to favor them. It's a problem that can't be resolved unfortunately.
ARC-AGI-2 keeps a private set of questions to prevent LLM contamination, but they have a public set of training and eval questions so that people can both evaluate their models before submitting to ARC-AGI and evaluate what the benchmark is measuring:
https://github.com/arcprize/ARC-AGI-2
Cursor is not alone in the field in having to deal with issues of benchmark contamination. Cursor is an outlier in sharing so little when proposing a new benchmark while also not showing performance in the industry standard benchmarks. Without a bigger effort to show what the benchmark is and how other models perform, I think the utility of this benchmark is limited at best.
We could have third-party groups with evaluation criteria who don't make models or sell A.I. Strictly evaluators. Alternatively, they could have a different kind of steady income, with evaluation being the only A.I. work they do.
Why publish the obscured benchmarks in the first place, then?
Benchmarks have become less and less useful. We have our own tests that we run whenever a new model comes out. It's a collection of trivial -> medium -> hard tasks that we've gathered, and it's much more useful to us than any published table. And it leads to more interesting finds, such as using cheaper models (5-mini, fast-code-1, etc) on some tasks vs. the big guns on other tasks.
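For illustration, here's a minimal sketch of what a tiered private harness like that could look like; the `Task` structure, the example task, and the `run_model` callable are hypothetical placeholders, not the commenter's actual setup.

```python
# A minimal sketch of a tiered private eval harness. The task list, checks,
# and model-calling function are all placeholders.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    name: str
    difficulty: str                  # "trivial", "medium", or "hard"
    prompt: str
    check: Callable[[str], bool]     # True if the model's output passes

def evaluate(run_model: Callable[[str], str], tasks: list[Task]) -> dict[str, float]:
    """Pass rate per difficulty tier for one model (run_model wraps your API client)."""
    results: dict[str, list[bool]] = {}
    for task in tasks:
        output = run_model(task.prompt)
        results.setdefault(task.difficulty, []).append(task.check(output))
    return {tier: sum(ok) / len(ok) for tier, ok in results.items()}

# Example: keep cheap models on trivial/medium tiers and reserve the
# expensive model for tasks the cheap one fails.
tasks = [
    Task("fizzbuzz", "trivial", "Write fizzbuzz in Python.", lambda out: "fizzbuzz" in out.lower()),
]
```

Keeping the tasks private matters for the same contamination reason discussed above: once the prompts leak into training data, pass rates stop meaning much.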
I'm happy to see Cursor iterate, as they were pretty vulnerable to the labs leaving them behind when all of them came out with coding agents. The multi-agent feature w/ built-in git worktree support is another big thing they launched recently. They can use their users as "teacher models" for multiple completions by competing models, and by proxying those calls, they get all the signals. And they can then use those signals to iterate on their own models. Cool stuff. We actually need competing products keeping each other in check, w/ the end result being more options for us, and sometimes even cheaper usage overall.
I wonder how much the methods/systems/data transfer; if they can pull off the same with their agentic coding model, that would be exciting.
I actually find myself using the agent mode less now; I like keeping code lean by hand and avoiding technical debt. But I do use the tab completions constantly, and they are fantastic now that they can jump around the file.
I run Claude Code in the background near constantly for a variety of projects, with --dangerously-skip-permissions, and review progress periodically. Tabbing is only relevant when it's totally failing to make progress and I have to manually intervene, and that to me is a failure scenario that is happening less and less often.
Usually I'll have several Claude Code sessions running in parallel on different projects, and when one of them stops I will review the code for that project and start it again - either moving forwards or re-doing things that have issues.
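As a rough sketch of that workflow (assuming the `claude` CLI's non-interactive `-p` mode; the project paths and prompts below are made up):

```python
# Sketch of running several Claude Code sessions in parallel, one per project.
# The paths and prompts are placeholders; review each project's changes by
# hand before starting its session again.
import subprocess

projects = {
    "/work/project-a": "Continue implementing the feature described in SPEC.md",
    "/work/project-b": "Fix the failing tests and keep the suite green",
}

procs = {
    path: subprocess.Popen(
        ["claude", "-p", prompt, "--dangerously-skip-permissions"],
        cwd=path,
    )
    for path, prompt in projects.items()
}

# When a session exits, review the diff for that project, then restart it,
# either moving forwards or re-doing things that have issues.
for path, proc in procs.items():
    proc.wait()
    print(f"{path}: session exited with code {proc.returncode}; review before restarting")
```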
I'm not against YOLO vibe coding, but being against tab completion is just insane to me. At the end of the day, LLMs help you achieve goals quicker. You still need to know what goal you want to achieve, and tab completion basically lets me complete a focused goal nearly as soon as I determine what my goal is.
And it's not remotely "YOLO vibe coding". All the code gets reviewed and tested thoroughly; it's worked to specs and gated by test suites.
What I don't do is babysit the LLM until its code passes both the test suite and automated review stages, because it's a waste of time.
Others of these projects are research tasks. While I wrote this comment, Claude unilaterally fixed a number of bugs in a compiler.
I tried to use an appropriate emoji to express the joking nature of this comment, but HN silently filtered it out, so pretend you see a grinning face.
Every time I write code myself I find myself racing the AI to get an indentation in before the AI is done... gets annoying
I am an ML researcher at Cursor and worked on this project. Would love to hear any feedback you may have on the model, and I can answer questions about the blog post.
I don't use these tools that much (I tried and rejected Cursor a while ago, and decided not to use it), but having played with GPT-5 Codex (as a paying customer) yesterday in regular VSCode, and having had Composer1 do the exact same things just now, it's night and day.
Composer did everything better, didn't stumble where Codex failed, and most importantly, the speed makes a huge difference. It's extremely comfortable to use, congrats.
Edit: I will therefore reconsider my previous rejection
Cursor Composer and Windsurf SWE 1.5 are both finetuned versions of GLM.
GPT-5-codex does more research before tackling a task, that is the biggest weakness for me not using Composer yet.
Could you provide any color on whether ACP (from Zed) will be supported?
Its generation speed is not the problem or the time sink.
It's wrestling with it to get the right output.
---
And just to clarify (maybe I misunderstood again), since people are comparing Cursor to Claude Code, Codex, etc. here: isn't this whole article all Cursor, just using different models?
literally a 30-day-old model and you've moved the "low" goalpost all the way there haha. funny how humans work
Speed of model just isn't the bottleneck for me.
Before it I used Opus 4.1, and before that Opus 4.0 and before that Sonnet 4.0 - which each have been getting slightly better. It's not like Sonnet 4.5 is some crazy step function improvement (but the speed over Opus is definitely nice)
Also, didn't realize you worked at Cursor - I'm a fan of your work - they're lucky to have you!
Totally agree that "smart model" is the table stakes for usefulness these days.
Wow, no kidding. It is quite good!
It’s the only coding agent I’m actually really motivated to use out of the box because it really does make me feel more productive while the others keep messing up the project, from way too large changes I didn’t ask for all the way to constant syntax and request errors.
It’s the only coding agent I’ve used that feels serious about being a product rather than a prototype. Their effort in improving their stack is totally paying off.
Countless times my requests in the AI chat just hang there for 30+ seconds until I can retry them.
When I decided to give Claude Code a try (I thought I didn't need it because I used Claude in Cursor), I couldn't believe how much faster it was, and literally 100% reliable.
EDIT: given today's release, decided to give it a go. The Composer1 model _is_ fast, but right at the second new agent I started I got this:
> Connection failed. If the problem persists, please check your internet connection or VPN
I would be willing to bet money your issue is on your side. I am a daily user since the beginning and cannot recall when I have had issues like you describe unless it was related to my corp network.
(Cursor dev)
Note, later I started using Codex and now Codex is my daily driver, Claude Code for problems where Codex fails (not many), and again Cursor is never used.
They were the first mover but Codex (in my opinion) blows Cursor up into 1000 tiny pieces. It's just so, so much better.
Can't help but notice you haven't tried Zed!
Also, somehow magically, I’ve found Cursor’s Auto mode to be significantly faster than the specific models I’ve tried, Claude being among them.
I would agree it is not as good at lengthy work where it's taking a design all the way through implementing a feature in a single shot, but trivial is not a good description.
I also don’t think you’re right. 3.5 was recently deprecated and even before then, Cursor has been hitting rate limits with Anthropic. Auto is as much a token cost optimization as it is a rate limit optimization.
(Cursor researcher)
($1.25 input, $1.25 cache write, $0.13 cache read, and $10 output per million tokens)
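For a rough sense of what those rates mean per request, here's a quick back-of-the-envelope calculation; only the per-million-token rates come from the line above, and the token counts are made up.

```python
# Cost of one hypothetical agent turn at the quoted per-million-token rates.
RATES = {  # dollars per million tokens (from the pricing above)
    "input": 1.25,
    "cache_write": 1.25,
    "cache_read": 0.13,
    "output": 10.00,
}

# Made-up token counts for a single turn.
usage = {"input": 20_000, "cache_write": 5_000, "cache_read": 150_000, "output": 3_000}

cost = sum(RATES[k] * usage[k] / 1_000_000 for k in usage)
print(f"${cost:.4f}")  # roughly $0.08 for this example turn
```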