Posted by jg2007 9/14/2025

Has Cursor started training models on private user data? (cursor.com)
4 points | 4 comments
jg2007 9/14/2025
Cursor recently published a new blog post outlining how they train models. Interestingly, the post does not clarify how they handle data from opted-out users or business users. Exact phrasing: "[cursor's] model runs on every user action, handling over 400 million requests per day. As a result, we have a lot of data about which suggestions users accept and reject. This post describes how we use this data to improve Tab using online reinforcement learning."

If anything, the wording sounds like all Cursor user data (opt-in and opt-out alike) is being used.

Does anyone know what's going on behind the scenes?

NitpickLawyer 9/14/2025
If you read the fine print, they all say roughly the same thing, some variation on "we do not train foundational models on your data". That is not to say they won't train other models, or use signals to train other models. It only means the data itself doesn't get copied into the training set.

And this makes sense. You train on your own data, and use the signals to know if your run was good or not.

reasonableklout 9/15/2025
Don't they transparently say that they train models on your actions by default unless you opt out as part of the install flow?
fithisux 9/14/2025
That is why I use VSCodium or Theia and Positron.

No AI features enabled.