Comment by abeppu
6 months ago
> which both (1) gives you an unlimited supply of noncopyrighted training data and (2) handily sidesteps the issue of AI-contaminated training data.
I think these are both basically somewhere between wrong and misleading.
Needing to generate your own data through actual experience is very expensive, and it means data acquisition now comes with real operational risks. Waymo gets real-world experience operating its cars, but the limit on how much data you can collect per unit time depends on the size of the fleet, and you first have to reach a level of competence where it's safe to operate in the real world at all.
If you want to repair cars, and you _don't_ start with some source of knowledge other than on-policy roll-outs, then you should expect to learn by trashing a bunch of cars (while still paying humans to tell the robot that it failed) for a significant period.
There's a reason you want your mechanic to have access to manuals and to have gone through some explicit training, rather than just trying stuff out to see what works, and those cost-based reasons hold whether the mechanic is human or AI.
Perhaps you're using an off-policy RL approach -- great! But if your off-policy data consists of demonstrations from a prior-generation model, that's still AI-contaminated training data.
So even if you're trying to learn by doing, there are still meaningful limits on the supply of training data (which may be way more expensive to produce than scraping the web), and likely still AI-contaminated (though perhaps with better info on the data's provenance?).
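To make the off-policy point concrete, here's a minimal sketch (all names and the toy MDP are hypothetical, not from any real system): a tabular Q-learning update consumes logged transitions the same way regardless of who produced them, which is why a buffer filled by a prior-generation model is just as "contaminated" as any other scraped data.

```python
import random

# Hypothetical sketch: tabular Q-learning from a fixed buffer of logged
# transitions. The update rule never asks who generated the transitions,
# so demonstrations from a prior-generation model are consumed exactly
# like human demonstrations.

def q_learning_from_buffer(buffer, n_states, n_actions,
                           alpha=0.1, gamma=0.9, steps=10_000, seed=0):
    """buffer: list of (state, action, reward, next_state) tuples."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(steps):
        # Off-policy: the behavior that logged this data need not match
        # the greedy target policy implied by max() below.
        s, a, r, s2 = rng.choice(buffer)
        target = r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])
    return q

# Toy two-state MDP: action 1 in state 0 yields reward 1 and stays there.
buffer = [(0, 0, 0.0, 1), (0, 1, 1.0, 0), (1, 0, 0.0, 0), (1, 1, 0.0, 1)]
q = q_learning_from_buffer(buffer, n_states=2, n_actions=2)
print(q[0][1] > q[0][0])  # the rewarding action dominates
```

The provenance question lives entirely outside this loop: nothing in the update distinguishes a human-logged `buffer` from a model-logged one.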