Comment by iamalizard

5 hours ago

What about a distributed way of doing search, does that exist?

Different people/bots scrape the net and add what they find to a distributed database optimized for search.

Each query could cost a crypto micropayment to avoid DDoS. Or maybe a slightly larger payment to download the whole database so you can use it privately or create a competing centralized or decentralized search.

Yes, we hate crypto, but it seems useful here. It's bad if 1 entity can gatekeep both the database and access to it, no matter how non-evil they seem now.

We might even index torrents, use speech-to-text for music, movies, video clips and other things like that. So you'll search for a phrase from a movie and it will be there even though no one mentions it on any website.

A couple of issues I can think of with that decentralized approach:

* copyright - fuck it, it's decentralized, it can index whole books, maybe partnering with Anna's Archive or LibGen. Maybe have a copyright-respecting database and another one that doesn't respect it if you foresee the man coming down on the project. Maybe the results from the DB that doesn't respect copyright are merged at query time with the results from the one that does. Or maybe the DB that doesn't respect copyright is just a superset of the copyright-respecting DB. I don't know how easy it would be to simultaneously search more than 1 DB.

* privacy - it could run over Tor or at least allow people to access it via Tor. The privacy of the cryptocurrency also seems doable - we have Monero and other private coins but I'm not sure how easy it would be to implement private micropayments with these.

* spam, intentionally wrong archives/crawls - pay the people who submit sites something so they have a financial motivation to not lie. Some consensus-based reward mechanism could be used, not sure which one.

* moderation, illegal content - we don't care about copyright but likely don't want real CSAM, real animal abuse and other obviously awful content. Rewards should also somehow be available to moderators or to people flagging content. We might even have a decentralized way to flag/tag content for anything at all - "AI generated" or "human generated", "small web", "uses Cloudflare", etc.

* how the distributed database actually works, how searching it works, who connects to whom when making a query and so on. I hope there are smart people with knowledge on such systems (not me lol) who can shed some light on whether it's possible and how.
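
On the "search more than 1 DB" and "who connects to whom" questions, here is a rough toy sketch of one possible shape: the client sends the query to every node it knows about, each node answers from its own inverted index, and the client merges the answers. Everything in it is made up for illustration (`IndexNode`, `federated_search`, the raw term-count scoring, the example URLs), and it ignores networking, payments, trust and spam entirely. It also assumes scores from different nodes are comparable, which a real system would have to work for.

```python
from collections import defaultdict


class IndexNode:
    """Hypothetical node holding an inverted index over its own documents."""

    def __init__(self, name):
        self.name = name
        # term -> {url: number of times the term occurs in that document}
        self.postings = defaultdict(dict)

    def add(self, url, text):
        """Index a document by splitting its text on whitespace."""
        for term in text.lower().split():
            self.postings[term][url] = self.postings[term].get(url, 0) + 1

    def search(self, query):
        """Return {url: score}, where score is the summed term counts."""
        scores = defaultdict(int)
        for term in query.lower().split():
            for url, count in self.postings.get(term, {}).items():
                scores[url] += count
        return dict(scores)


def federated_search(nodes, query, limit=10):
    """Query every node, keep the best score per URL, and rank the merge."""
    best = {}  # url -> (score, name of the node that reported it)
    for node in nodes:
        for url, score in node.search(query).items():
            # Deduplicate: a URL present on several nodes is kept once.
            if url not in best or score > best[url][0]:
                best[url] = (score, node.name)
    # Highest score first; ties broken by URL so the order is stable.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1][0], kv[0]))
    return [(url, score, source) for url, (score, source) in ranked[:limit]]


# Two indexes, where "full" is a superset of "clean".
clean = IndexNode("clean")
full = IndexNode("full")
clean.add("https://example.org/a", "distributed search index")
full.add("https://example.org/a", "distributed search index")
full.add("https://example.org/b", "search search engine")
print(federated_search([clean, full], "search"))
```

In this toy version the merge itself is the easy part. If one DB is a strict superset of the other, querying only the superset gives the same result set, so the two-DB merge only matters when they actually differ.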