Comment by pfraze

2 years ago

The network can be reduced to three primary roles: data servers, the aggregation infrastructure, and the application clients. Anybody can operate any of these, but generally the aggregation infra is high scale (and therefore expensive).

So you can have anyone fulfilling these roles. At present there are somewhere around 60 data servers, including one large one that we run; one aggregator infra; and probably around 10 actively developed clients. We hope to see all of these roles expand over time, but a likely stable future will see about as many aggregator infrastructures as the Web has search engines.

When we say an infrastructure takedown, we mean removal from the aggregator and the data server we run. This is high impact but not total: the user could migrate to another data server and then use another infra to persist. If we ever fail (on policy, as a business, etc.), there is essentially a pathway for people to displace us.

8 comments

Vinnl 2 years ago

Why would anyone run their own aggregator? (i.e. if you run a search engine, you can show contextual ads to recoup your investment and then some.)

Sorry about going off-topic, I realise it's only tangentially about labelling.

pfraze 2 years ago

We'll let you know when we figure out why we're doing it.
- Vinnl 2 years ago
  
  I guess I should have asked about anyone else :) I know why you would - you're planning to sell services around Bluesky [0], and thus need Bluesky itself to be working.
  But if it's already working (because you're running an aggregator), there doesn't seem to be much reason for anyone else to run one? In other words, isn't there a significant risk that there will be fewer aggregators than there are search engines, i.e. just a single one?
  [0] https://bsky.social/about/blog/7-05-2023-business-plan
  

bobajeff 2 years ago

Would it be possible to do a p2p aggregator (Like yacy but for atprotocol)?

pfraze 2 years ago

It might be worth trying, but essentially what you're trying to do is cost/load sharing on the aggregation system. You could do that by computing indexes and sharing them around, to reduce some amount of required compute, and I suspect we'll be doing things like that. (For example, having the precomputed follow graph index as a separate dataset.) However, if you're trying to replace the full operational system, I think the only kind of load sharing that could work would require federated queries, which I consider a pretty unproven concept.
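
To make the "precomputed follow graph index" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how Bluesky's aggregator works: the record shape (follower/followee pairs), the `did:example:` identifiers, and the `build_follow_index` helper are all hypothetical. The point is that the index is computed once and serialized, so other operators could load it instead of recomputing it from every data server.

```python
import json
from collections import defaultdict


def build_follow_index(follow_records):
    """Build a two-way follow index from (follower, followee) pairs.

    The input shape is an assumption made for this sketch; real follow
    records carry more fields.
    """
    following = defaultdict(set)
    followers = defaultdict(set)
    for follower, followee in follow_records:
        following[follower].add(followee)
        followers[followee].add(follower)
    # Sorted lists keep the serialized form deterministic, so two parties
    # computing the index from the same records get identical output.
    return {
        "following": {k: sorted(v) for k, v in following.items()},
        "followers": {k: sorted(v) for k, v in followers.items()},
    }


# Hypothetical sample data; the last record is a duplicate on purpose.
records = [
    ("did:example:alice", "did:example:bob"),
    ("did:example:carol", "did:example:bob"),
    ("did:example:bob", "did:example:alice"),
    ("did:example:alice", "did:example:bob"),
]

index = build_follow_index(records)

# Serialize once; this is the "separate dataset" that could be shared.
shared_dataset = json.dumps(index, sort_keys=True)

# A consumer loads the dataset and answers lookups without recomputing.
loaded = json.loads(shared_dataset)
print(loaded["followers"]["did:example:bob"])
```

This covers only the cheap part, sharing a derived dataset. It does not address the harder case raised above, where replacing the full operational system would need federated queries.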