Comment by b1085436

1 month ago

I strongly suspect that it is absolutely impossible to build an even remotely usable or useful "AI" trained on tiny datasets. Instead of training only on ethical data, companies that want to sound ethical will take a dirty foundation model and add an extra post-training step so that it behaves as if it had only learned from ethical sources. I'd hate for this to become the norm, but I fear this is logically what announcements like this one really mean. The difference in scale is just too vast: taking whatever you want from the entire internet versus hand-curated datasets that are explicitly authorised and free to use. It's like trying to make a grain of sand orbit a marble in the playground to mimic the moon around the Earth: it won't work.
