Comment by davebren
17 days ago
> "In its IPO filing, the company had said Cursor's access to developers' data, including coding requests and design decisions, could help improve its AI models such as Grok."
They're all stealing your IP and selling it back to your competitors in the form of tokens.
Yes, that is in fact how models get better at coding.
Such a ridiculous stance: "I want LLMs to code for me, but I want them to be trained on other people's code, not mine, duh".
> "I want LLMs to code for me, but I want them to be trained on other people's code, not mine, duh".
Who ever said that? Have you actually heard that from your fellow programmers in real life?
If the code I wrote actually made even the slightest discernible difference in LLMs I'd be so honored. But it won't happen, as it's just 0.00001% of all the training data.
Real life? Mostly not. Hacker News? Absolutely. Literally the comment I am replying to.
> But it won't happen, as it's just 0.00001% of all the training data.
Are you familiar with Tragedy of the Commons?
Sounds good? They can pay for code they want to train on. There are plenty of companies sending me offers to code training materials for them for $50-100/hr. Don’t expect to charge me an arm and a leg for inference and then also train on my code.
There are already opt-out buttons for training in Cursor and Claude Code. If you don’t want it, then turn it off. If it were worth enough money to them, they would offer a monetary incentive like discounts, but none of them have yet.
This just makes the inference more expensive for you?
How about: "I do not want it to code for me or anyone if it steals from someone."
Interesting how our generation, which grew up using Napster, now has so many intellectual property extremists. By this logic, even humming a tune you heard on the radio is theft.
Ok, now that you mentioned it, I actually want that.
Who are you quoting?
It's a common sentiment. An example from a few hours ago: https://news.ycombinator.com/item?id=48558954
> I have absolutely zero interest in free. I honestly don't think I'm even remotely in the same demographic as people using free tiers / models. I want to pay. I don't want my data used for training...
They want to use LLMs trained on others' code but don't want to contribute their own.
Not casting judgement, just pointing out.
I mean, AlphaZero et al. start from zero. I learned by writing my own code, aside from documentation and some textbooks.
Fair point, an AlphaZero of code would be very interesting indeed.
This is the correct take
No one is stealing anything here.
Cursor users are willfully providing it by using their product. Not unlike uploading a personal photo or video to social media -- that's not yours anymore. You gave all rights away when you put it on their servers.
> Not unlike uploading a personal photo or video to social media -- that's not yours anymore. You gave all rights away when you put it on their servers.
99% of users are unaware of that, so it’s false to say they "are willfully providing it".
> Not unlike uploading a personal photo or video to social media
I've granted them a limited license to use it.
> that's not yours anymore.
Not by any definition in the contract or in law is this true.
> You gave all rights away when you put it on their servers.
I gave away some rights. I also got something in return. Attention. And at the end of the day I'm completely entitled to turn around and sell copies of this work for profit. The only thing I can't do is sell an /exclusive/ license because that is no longer available.
None of this has any implication for people who upload code to their own websites, which these rapacious LLM bots happily index, sometimes to the extent that they actually crush the site or create unusual costs for the owner.
Finally, none of these LLM companies tell you where the source came from: whether it is copyrighted, whether ownership rights are retained, or whether the code can be used publicly or not, and if so, which license it's covered by.
You're using the lens of social media contracts to understand something far larger and more important. It's led you to some bizarre conclusions and huge oversights.
What if someone steals my work and then uploads it to Facebook and claims it as their own? Do the rights no longer exist because it got uploaded to Meta?
Do you think that everyone using it and their employers are aware that they are giving their competition the ability to copy straight from their codebase when they ask it to replicate their product?
Most serious employers have data retention agreements.
I work at a 20k-employee company with a footprint in tech. A couple of weeks ago a decision was made not to renew the Cursor license. This week it was made public and has been greeted with groans. Some employees have been publicly saying how ineffective and unproductive they are going to become due to this decision, which actually strikes me as really naive.
Personally I like to use a stable IDE. I used Cursor for a couple of days and then went back to VS Code, largely due to Cursor pushing an agentic-first approach with the V3 update.
Most people are not good at their jobs and IDEs like Cursor let bad SWEs feel more productive because they are committing more code.
Indeed, according to the Cursor leaderboard, there were folks committing 200k-300k lines of code every month. And here I am intentionally slowing myself down, manually approving every AI-generated code change and terminal command.
> They're all stealing your IP and selling it back to your competitors in the form of tokens.
The users of those tools are stealing too. The model is trained on free software licensed under specific terms and the output of a prompt will strip those licenses and their terms.
Quit posting trash. This isn't theft.
It's not theft in the same way a con artist convinces you to give him all your money. The people using it or their employers just don't realize any competitor will be able to ask the LLM to replicate their product and it will copy the codebase they uploaded to them.
This isn’t how training works.