Comment by nickpsecurity
1 month ago
They mostly do that. They risked legal contamination by using Whisper-derived text and web text which might have gotchas. Other than that, it was a great collection for low-risk training.
1 month ago
They mostly do that. They risked legal contamination by using Whisper-derived text and web text which might have gotchas. Other than that, it was a great collection for low-risk training.
No comments yet
Contribute on Hacker News ↗