Comment by locusofself

3 days ago

Why do they "insert" even non-celebrity posts into each follower's timeline? That is not intuitive to me.

To serve a user timeline in single-digit milliseconds, it is not practical for a data store to load each item from a different place. Even with an index, the index itself can be contiguous on disk, but the payload is scattered all over the place if you keep it in a single large table.

Instead, you can drastically improve read performance if you are able to store the data for each timeline somewhat contiguously on disk.

Think of it as pre-rendering. Compared with collecting the posts just in time when the timeline is requested, pre-rendering means more work, but that work is async, and the timeline is ready whenever a user requests it, which gives a fast user experience.

(Although I don't understand the "non-celebrity" part of your comment -- the timeline contains (pointers to) posts from whoever someone follows, and doesn't care who those people are.)
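
To make the pre-rendering idea concrete, here is a minimal in-memory sketch of fan-out on write. Everything in it is hypothetical (the names `posts`, `followers`, `timelines`, `follow`, `publish`, `read_timeline` are mine, not from the article), and a real system would do the fan-out asynchronously against a persistent store rather than in a loop over dicts.

```python
from collections import defaultdict, deque
from itertools import islice

# Hypothetical in-memory stand-ins for the real data stores.
posts = {}                      # post_id -> content, stored exactly once
followers = defaultdict(set)    # author -> set of users following them
timelines = defaultdict(deque)  # user -> post_ids, newest first


def follow(follower, author):
    """Record that `follower` follows `author`."""
    followers[author].add(follower)


def publish(post_id, author, content):
    """Store the post once, then push only its id to each follower's timeline."""
    posts[post_id] = content
    for user in followers[author]:
        # This is the "pre-rendering" step: work done at write time.
        timelines[user].appendleft(post_id)


def read_timeline(user, limit=10):
    """Read path: the ids are already grouped per user; just resolve them."""
    ids = islice(timelines[user], limit)
    return [posts[post_id] for post_id in ids]
```

After `follow("alice", "bob")` and a couple of `publish(...)` calls by bob, `read_timeline("alice")` only touches alice's own list of ids, which is the part you would want to keep contiguous on disk.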

  • Perhaps I'm misunderstanding. I thought the actual content of each tweet was being duplicated into the timeline of every user who follows the author, which sounded extremely wasteful, especially in the case of someone who has 200 million followers.

    • From the linked article: "Additionally, a reference to your post is 'fanned out' to your followers so they can see it in their Timelines."

      So not the content, just a sort of link to it.