Comment by frogperson
2 days ago
I think it would be really cool if someone built a reverse proxy just for dealing with these bad actors.
I would really like to easily serve some Markov chain nonsense to AI bots.
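For anyone wondering what "Markov chain nonsense" would look like in practice, here is a minimal sketch in Python. It only covers the text generation part, not the proxy or the bot detection. The function names (`build_chain`, `generate`) and the word-level, order-2 design are my own assumptions for illustration, not how any existing tool does it.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each `order`-word prefix to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        chain[prefix].append(words[i + order])
    return dict(chain)


def generate(chain, length=50, seed=None):
    """Walk the chain to produce `length` words of plausible-looking filler."""
    rng = random.Random(seed)
    prefixes = sorted(chain)  # sorted so a fixed seed gives a repeatable walk
    order = len(prefixes[0])
    out = list(rng.choice(prefixes))
    while len(out) < length:
        followers = chain.get(tuple(out[-order:]))
        if not followers:
            # Dead end (prefix only seen at the end of the corpus): restart.
            out.extend(rng.choice(prefixes))
            continue
        out.append(rng.choice(followers))
    return " ".join(out[:length])
```

You would feed `build_chain` any corpus longer than `order` words, then return `generate(chain)` as the response body for requests you have decided are crawlers.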
Perhaps Iocaine [1] is what you're looking for. See the demo page [2] for what it serves to AI crawlers.
1. https://iocaine.madhouse-project.org/
2. https://poison.madhouse-project.org/
For images, there's stuff like Nightshade: https://nightshade.cs.uchicago.edu/whatis.html
This site blocked me right away. Seems quite aggressive.
Seems like a good way to waste tons of your bandwidth. Almost every serious data pipeline has some quality filtering in there (even open-source ones like FineWeb and FineWeb-Edu). And the stuff Iocaine generates gets filtered out instantly.
Feel free to test this with any classifier or cheapo LLM.
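If you want to run that test yourself, a rough harness could look like the sketch below. It measures what fraction of sample documents survive a score threshold. `pass_rate` and `toy_score` are hypothetical names, and `toy_score` is only a placeholder so the snippet runs on its own; it is not what FineWeb or any real pipeline uses. Swap in your own classifier or LLM-based scorer.

```python
def pass_rate(docs, score, threshold):
    """Fraction of `docs` whose score is at or above `threshold`."""
    docs = list(docs)
    if not docs:
        return 0.0
    kept = sum(1 for doc in docs if score(doc) >= threshold)
    return kept / len(docs)


def toy_score(doc):
    """Placeholder scorer: share of whitespace-separated tokens that are purely alphabetic.

    Stands in for a real quality classifier; replace it before drawing conclusions.
    """
    tokens = doc.split()
    if not tokens:
        return 0.0
    return sum(token.isalpha() for token in tokens) / len(tokens)
```

Run `pass_rate` once on generated pages and once on real pages with the same scorer and threshold, then compare the two numbers.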