Comment by ekipan

10 hours ago

I forget the context but the other day I also learned about Snowflake IDs [1] that are apparently used by Twitter, Discord, Instagram, and Mastodon.

Timestamp + random seems like it could be a good tradeoff to reduce the ID sizes and still get reasonable characteristics. I'm surprised the article didn't explore that (but then again "timestamps" are a lot more nebulous at universal scale, I suppose). Just spitballing here, but I wonder if it would be worthwhile to reclaim ten bits of the Snowflake timestamp, leaving it at roughly one-second resolution, and use the low 32 bits for a random number. That's about four billion possible values for each second.
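To make the spitballed layout concrete, here is a rough Python sketch. It assumes the usual Snowflake split of 41 timestamp bits, 10 machine bits, and 12 sequence bits, so dropping ten timestamp bits leaves 31 bits of time and 32 low bits for randomness. The names (make_id, split_id, EPOCH) and the epoch value are made up for illustration, and it counts whole seconds rather than the 1024 ms ticks you would get by literally shifting the millisecond timestamp.

```python
import secrets
import time

TIMESTAMP_BITS = 31        # seconds since EPOCH; 2**31 seconds is about 68 years
RANDOM_BITS = 32           # replaces Snowflake's machine ID + sequence fields
EPOCH = 1_700_000_000      # hypothetical custom epoch (Unix seconds), not Twitter's


def make_id(now=None):
    """Pack a seconds timestamp and 32 random bits into one 63-bit integer."""
    seconds = int(time.time() if now is None else now) - EPOCH
    if not 0 <= seconds < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 31 bits")
    return (seconds << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)


def split_id(id_value):
    """Return (seconds_since_epoch, random_part) for an ID from make_id."""
    return id_value >> RANDOM_BITS, id_value & ((1 << RANDOM_BITS) - 1)
```

IDs made this way still sort by creation second, but nothing prevents two IDs generated in the same second from drawing the same random part.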

There's a Tom Scott video [2] that describes YouTube video IDs as 11-digit base-64 random numbers, but I don't see any official documentation about that. At the end he says how many IDs are available, but I don't think he considers collisions via the birthday paradox.
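For a sense of scale, a quick Python sketch, taking the description at face value (11 characters, 64 symbols each, chosen uniformly at random), which as noted isn't officially documented. The one-billion video count is a made-up round number, and the function name is hypothetical.

```python
import math

ALPHABET_SIZE = 64
ID_LENGTH = 11
ID_SPACE = ALPHABET_SIZE ** ID_LENGTH   # 64**11 == 2**66 possible IDs


def any_collision_probability(count, space=ID_SPACE):
    """Birthday approximation: chance that `count` random IDs contain a duplicate."""
    return -math.expm1(-count * (count - 1) / (2 * space))


HYPOTHETICAL_VIDEO_COUNT = 1_000_000_000   # assumption, not a real statistic
p = any_collision_probability(HYPOTHETICAL_VIDEO_COUNT)
```

Under those assumptions the raw ID count is huge, but the chance of at least one duplicate grows with the square of the number of IDs issued, so a real system would presumably still check for an existing ID before assigning one.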

[1]: https://en.wikipedia.org/wiki/Snowflake_ID

[2]: https://youtu.be/gocwRvLhDf8

Getting the entire universe to agree on a single clock for creating timestamps sounds absurdly difficult. Probably impossible?

  • You don't need the universe to agree. You need your ID system to agree within a reasonable margin of error.

  • "Agreement" of time is probably nonsense, yeah. I realized after posting so I edited in the parenthetical, but as [3] notes, locality probably makes this less of a real issue.

    Apparently, with the birthday paradox, 32-bit random IDs only allow some tens of thousands per second (roughly 77,000) before the collision chance passes 50%. Maybe that's acceptable?

    [3]: https://news.ycombinator.com/item?id=47065241
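The birthday figure for 32-bit random IDs can be checked with a few lines of Python. This is a sketch using the standard approximation p ≈ 1 - exp(-n(n-1)/(2N)) and assuming uniformly random, independent IDs; the function names are made up for illustration.

```python
import math


def collision_probability(count, bits):
    """Approximate chance that `count` random `bits`-bit IDs contain a duplicate."""
    space = 2.0 ** bits
    return -math.expm1(-count * (count - 1) / (2 * space))


def count_for_probability(p, bits):
    """Approximate number of random IDs at which the collision chance reaches p."""
    space = 2.0 ** bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


fifty_percent_point = count_for_probability(0.5, 32)   # about 77 thousand
one_percent_point = count_for_probability(0.01, 32)    # about 9 thousand
```

So with 32 random bits per one-second bucket, the 50% point sits near 77,000 IDs in the same second, and even a 1% collision chance arrives at roughly 9,000.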