Comment by nostrademons
1 day ago
That's sorta what MetaBrainz did - they offer their whole DB as a single tarball dump, much like what Wikipedia does. Downloading it took on the order of an hour; when I need a MusicBrainz lookup, I just run a local query.
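To sketch what "just do a local query" looks like: assume a subset of the dump has been loaded into a local SQLite file (the real MusicBrainz dumps are Postgres dumps; the table and column names here are illustrative, not the actual schema).

```python
import sqlite3

# Illustrative stand-in for a locally imported dump. The real dumps are
# Postgres dumps; "artist(gid, name)" here is an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (gid TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO artist VALUES (?, ?)", ("mbid-1", "The Beatles"))

def lookup_artist(name):
    # A local lookup: no HTTP request, no scraping, no load on their servers.
    return conn.execute(
        "SELECT gid, name FROM artist WHERE name = ?", (name,)
    ).fetchone()

print(lookup_artist("The Beatles"))  # → ('mbid-1', 'The Beatles')
```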
For this strategy to work, people need to actually use the DB dumps instead of just defaulting to scraping. Unfortunately scraping is trivially easy, particularly now that AI code assistants can write a working scraper in ~5-10 minutes.
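For a sense of how low the bar is: a scraper is basically "fetch page, pull out the fields you want." Here's a minimal stdlib-only sketch, parsing an assumed page structure (`<li class="artist">`) from an inline string; a real one would fetch over HTTP with `urllib` instead.

```python
from html.parser import HTMLParser

# Assumed page structure for illustration; any real site's markup differs.
PAGE = """
<ul>
  <li class="artist">The Beatles</li>
  <li class="artist">Miles Davis</li>
</ul>
"""

class ArtistScraper(HTMLParser):
    """Collects the text of every <li class="artist"> element."""

    def __init__(self):
        super().__init__()
        self.in_artist = False
        self.artists = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "artist") in attrs:
            self.in_artist = True

    def handle_data(self, data):
        if self.in_artist and data.strip():
            self.artists.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_artist = False

scraper = ArtistScraper()
scraper.feed(PAGE)
print(scraper.artists)  # → ['The Beatles', 'Miles Davis']
```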
I mean, this AI data scraper would need to scan and fetch billions of websites.
Why would they even care about one single website? You expect an institution to special-case one site out of the billions they must scrape daily?
This is probably the reason. It's more effort to special-case every site that offers dumps than to just unleash your generic scraper on every site.
The obvious thing would be to take down the website and offer only the DB dump.
If that's the useful thing, it doesn't need the wrapper.