Comment by gkbrk

2 days ago

Seems like a good way to waste tons of your own bandwidth. Almost every serious data pipeline has some quality filtering in there (even open-source ones like FineWeb and FineWeb-Edu). And the stuff Iocaine generates instantly gets filtered.

Feel free to test this with any classifier or cheapo LLM.
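
For anyone who wants to try it without a model, here is a rough sketch of the kind of rule-based checks such pipelines run before any classifier. The function name, stop-word list, and thresholds are all made up for illustration and are not taken from FineWeb or any other real pipeline. Passing these rules says nothing about passing a trained classifier, so treat it as a first sanity check only.

```python
import re

# Tiny illustrative stop-word list (an assumption, not from any real pipeline).
STOP_WORDS = {"the", "be", "to", "of", "and", "that", "have", "with"}


def passes_quality_filter(text, min_words=20, min_stop_words=2,
                          max_dup_fraction=0.3):
    """Return True if `text` survives a few cheap heuristic checks.

    All thresholds are hypothetical defaults chosen for this sketch.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())

    # Rule 1: drop very short documents.
    if len(words) < min_words:
        return False

    # Rule 2: mean word length should look like natural language.
    mean_len = sum(len(w) for w in words) / len(words)
    if not 3 <= mean_len <= 10:
        return False

    # Rule 3: real prose contains at least a few distinct common stop words.
    if len(STOP_WORDS & set(words)) < min_stop_words:
        return False

    # Rule 4: drop documents made mostly of duplicated lines.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    dup_fraction = 1 - len(set(lines)) / len(lines)
    if dup_fraction > max_dup_fraction:
        return False

    return True
```

To test the claim, save a few generated pages as text, call `passes_quality_filter(page_text)` on each, and compare the pass rate against pages from a normal site. A small classifier or LLM prompt would be the next step up.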