Comment by fastball

16 days ago

Yes, that is in fact how models get better at coding.

Such a ridiculous stance: "I want LLMs to code for me, but I want them to be trained on other people's code, not mine, duh".

20 comments


raincole 16 days ago

> "I want LLMs to code for me, but I want them to be trained on other people's code, not mine, duh".

Who ever said that? Have you actually heard that from your fellow programmers in real life?

If the code I wrote actually made even the slightest discernible difference in LLMs I'd be so honored. But it won't happen, as it's just 0.00001% of all the training data.

fastball 16 days ago
Real life? Mostly not. Hacker News? Absolutely. Literally the comment I am replying to.
> But it won't happen, as it's just 0.00001% of all the training data.
Are you familiar with the Tragedy of the Commons?
- timeon 16 days ago
  
  The Tragedy of the Commons is just an analogy, not a fact.
  

oefrha 16 days ago

Sounds good? They can pay for code they want to train on. There are plenty of companies sending me offers to code training materials for them for $50-100/hr. Don’t expect to charge me an arm and a leg for inference and then also train on my code.

dmix 16 days ago
There are already opt-out buttons for training in Cursor and Claude Code… if you don’t want it, then turn it off. If it were worth enough money to them, they would offer a monetary incentive like discounts, but none of them have yet.
- Ritewut 16 days ago
  
  They are talking about the millions of lines of code they stole to make the product in the first place and I'm sure you know that.
  
fastball 15 days ago

This just makes the inference more expensive for you?

timeon 16 days ago

How about: "I do not want it to code for me or anyone if it steals from someone."

xp84 16 days ago

Interesting how our generation, which grew up using Napster, now has so many intellectual property extremists. By this logic, even humming a tune you heard on the radio is theft.

ergocoder 16 days ago

OK, now that you mention it, I actually want that.

davebren 16 days ago

Who are you quoting?

bel8 16 days ago
It's a common sentiment. An example from a few hours ago: https://news.ycombinator.com/item?id=48558954
> I have absolutely zero interest in free. I honestly don't think I'm even remotely in the same demographic as people using free tiers / models. I want to pay. I don't want my data used for training...
They want to use LLMs trained on others' code but don't want to contribute their own.
Not casting judgment, just pointing it out.
- skissane 16 days ago
  
  It makes sense from a business perspective: SaaS firms value the ability of coding agents to accelerate development, but also worry the models will learn the secret sauce of their business and destroy its moat. So their desire to contractually exclude training on their data has some logic to it.
  (Disclaimer: Not speaking for or about my current employer, just a general industry observation.)
- davebren 16 days ago
  
  I don't really use LLMs myself, but if someone wants to have any kind of software business then having the models trained on their products isn't ideal.

arcticfox 16 days ago

I mean, AlphaZero et al. start from zero. I learned by writing my own code, with nothing but documentation and some textbooks.

fastball 13 days ago

Fair point, an AlphaZero of code would be very interesting indeed.

ThouYS 16 days ago

This is the correct take