Comment by toraway

8 days ago

Huh? AI labs are routinely paying millions to billions to various 3rd party contractors specializing in creating/labeling/verifying specialized content for pre/post-training.

This would just be one more checkbox buried in hundreds of pages of requests, and compared to plenty of other ethical grey areas like copyright laundering with actual legal implications, leaking that someone was asked to create a few dozen pelican images seems like it would be at the very bottom of the list of reputational risks.

red75prime 8 days ago

Who do you think is in on that? Not only the pelicans, I mean, the whole thing. CEOs, top researchers, select mathematicians, congressmen? Does China participate in maintaining the bubble?

I, myself, prefer the universal approximation theorem and the empirical finding that stochastic gradient descent is good enough (and "no 'magic' in the brain", of course).

usefulposter 7 days ago
Well, since we're all talking about sourcing training material to "benchmaxx" for social proof, and not litigating the whole "AI bubble" debate, here's just the entire cottage industry of data curation firms:
https://scale.com/data-engine
https://www.appen.com/llm-training-data
https://www.cogitotech.com/generative-ai/
https://www.telusdigital.com/solutions/data-for-ai-training/...
https://www.nexdata.ai/industries/generative-ai
---
P.S. Google Comms would have been consulted re putting a pelican in the I/O keynote :-)
https://x.com/simonw/status/1924909405906338033
- red75prime 7 days ago
  
  Cool. At least they are working across the board and benchmaxing random things like the theory of mind.