Comment by tough
7 months ago
Someone suggested I apply a filter that serves .md or .txt to bots/AI scrapers instead of the regular website. Seems smart if it works, but I hate it when I get captchas, and this could similarly end up flagging non-bots as bots.
Maybe a "view full website" link loaded via JS so bots don't see it, idk.
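Rough sketch of what I mean, in Python. Everything here is an assumption on my part: the token list is just illustrative, and `choose_variant` and `force_full` are names I made up (`force_full` stands in for whatever the "view full website" link would set). User-agent sniffing like this is easy to fool in both directions, which is exactly the false-positive worry.

```python
# Illustrative, incomplete list of crawler user-agent substrings (assumption).
BOT_TOKENS = ("gptbot", "claudebot", "ccbot", "perplexitybot")


def looks_like_bot(user_agent: str) -> bool:
    """Guess whether a request comes from a scraper, based only on its User-Agent."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def choose_variant(user_agent: str, force_full: bool = False) -> str:
    """Return which version of the page to serve: "markdown" or "html".

    force_full is a hypothetical escape hatch (e.g. set by a "view full
    website" link) so a misdetected human can still get the normal site.
    """
    if force_full:
        return "html"
    return "markdown" if looks_like_bot(user_agent) else "html"
```

So a regular browser UA gets the normal site, a matching crawler UA gets the markdown version, and the escape hatch always wins.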
I would love to see most sites serve me markdown. I'd happily install a browser extension to mask me as an AI bot scraper if it means I can just get the text without all the noise.
Someone built a service for AI bots called pure.md. It's been a godsend for curling websites as markdown on the occasions where it doesn't work the first time, and it works great for occasional use on the free tier.
I have good news. It (almost) exists; it's called Gemini [0].
- [0] https://geminiprotocol.net/
Not news to me. I host my own gemlog there, but post rather infrequently.
Websites as gemtext would be even better than as markdown, but less likely to be fed to bots.
lol
me too tbh
Someone pointed out you can enable reader mode by default in Safari under settings, but even then not all websites' pages are served as reader-mode-enabled pages.