Comment by zahlman
1 day ago
The third-party hit-counting service I use implies that I'm not getting any of this bot scraping on my GitHub blog.
Is Microsoft doing something to prevent it? Or am I so uncool that even bots don't want to read my content :(
I'm interested in that service and how it works. Link?
It is https://github.com/silentsoft/hits . It works by loading an SVG "shield" file (like the ones you see at the top of GitHub readmes all the time) from a unique URL on their server (you just choose the URL when you write/render your HTML). The server, implemented in Java, just counts hits to each URL in a database and sends back the corresponding SVG data. There's also a mini dashboard website where you can check basic stats for a given URL (no login required, everyone's hits-per-day stats are just public) and preview styling options for the SVG. For example, for my most recent blog post https://zahlman.github.io/posts/2025/12/31/oxidation/, I configured it such that you can view the stats via https://hits.sh/zahlman.github.io+oxidation/ (note that the trailing slash is required).
(The about section on GitHub bills the project as "privacy-friendly", which I would say is nonsense as these dashboards are public and their URLs are trivially computed. But it's also hard to imagine caring.)
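To make the mechanism concrete, here is a rough Python sketch of the idea (count a hit per requested URL, send back an SVG). This is not the actual silentsoft/hits code, which is Java and backed by a database. The names `hits`, `render_badge`, `record_hit`, `BadgeHandler` and `serve` are made up for illustration, and the in-memory counter stands in for the database.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory stand-in for the service's database.
hits = Counter()


def render_badge(count: int) -> str:
    """Return a very plain SVG badge showing the hit count."""
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="90" height="20">'
        '<rect width="90" height="20" fill="#555"/>'
        '<text x="45" y="14" fill="#fff" font-size="11" '
        f'text-anchor="middle">hits: {count}</text>'
        "</svg>"
    )


def record_hit(path: str) -> str:
    """Count one hit for this URL path and return the updated badge."""
    hits[path] += 1
    return render_badge(hits[path])


class BadgeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every GET of a given path counts as one hit for that path.
        body = record_hit(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "image/svg+xml")
        # Ask clients not to cache, so repeat views reach the server.
        self.send_header("Cache-Control", "no-cache")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), BadgeHandler).serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8000/some-page/` a few times should show the number in the badge going up. The point is just that a hit is only recorded when something actually requests the image URL.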
They're probably not downloading every SVG each time they scrape the site. They're probably focused on scraping the text, so the counter would never see them.
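A small Python sketch of why that would hide bots from the counter, assuming the scraper only parses the HTML it fetched and never requests embedded images. The `TextOnlyScraper` class and the `hits.example` URL are hypothetical, not anything a real scraper is known to use.

```python
from html.parser import HTMLParser


class TextOnlyScraper(HTMLParser):
    """Collects page text; image URLs are noted but never requested."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.skipped_images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # No HTTP request is made here, so the counting
                # server never registers a hit for this page.
                self.skipped_images.append(src)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


# Hypothetical page with a hit-counter badge embedded as an image.
page = (
    "<h1>Post</h1><p>Hello</p>"
    '<img src="https://hits.example/example.test+post/">'
)
scraper = TextOnlyScraper()
scraper.feed(page)
```

After `feed`, `scraper.text` holds the page text and `scraper.skipped_images` holds the badge URL, which was seen but never fetched. A browser rendering the same page would request that URL, and only that request would count as a hit.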