Distributed ID Formats Are Architectural Commitments, Not Just Data Types

4 days ago (piljoong.dev)

I wrote an article comparing different GUID implementations, along with a spreadsheet of side-by-side comparisons.

https://medium.com/@orefalo_66733/globally-unique-identifier...

Looking at your implementation, I like the clean split between shard, tenant, and sequence.

However, this results in a 160‑bit format, which does not fit natively in most databases, since their built-in UUID type is only 128 bits. I also find 60 bits of randomness to be low (ULID, by comparison, has 80).

Last point: using a GUID is not only about sharding. It also protects against predictability, which, beyond the GUID's structure, requires using an approved cryptographically secure random generator.
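
For concreteness, here is a minimal sketch of what that could look like (Go; the field widths and byte layout are my own assumptions, not the article's actual format), with the unpredictable tail drawn from crypto/rand rather than a seeded PRNG:

    package main

    import (
        "crypto/rand" // CSPRNG; math/rand here would make IDs guessable
        "encoding/binary"
        "fmt"
        "time"
    )

    // newID sketches a 160-bit (20-byte) ID with a shard/tenant/sequence split.
    // Assumed layout: 48-bit ms timestamp | 16-bit shard | 24-bit tenant |
    // 16-bit sequence | 56-bit random tail (byte-aligned for simplicity).
    func newID(shard uint16, tenant uint32, seq uint16) ([20]byte, error) {
        var id [20]byte

        // 48-bit millisecond timestamp plus 16-bit shard in the first 8 bytes.
        ts := uint64(time.Now().UnixMilli()) & (1<<48 - 1)
        binary.BigEndian.PutUint64(id[0:8], ts<<16|uint64(shard))

        // 24-bit tenant and 16-bit sequence in the next 5 bytes.
        id[8] = byte(tenant >> 16)
        id[9] = byte(tenant >> 8)
        id[10] = byte(tenant)
        binary.BigEndian.PutUint16(id[11:13], seq)

        // The unpredictable part must come from a cryptographically secure
        // source; this is the "approved" generator point above.
        if _, err := rand.Read(id[13:]); err != nil {
            return id, err
        }
        return id, nil
    }

    func main() {
        id, err := newID(7, 42, 1)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%x\n", id)
    }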

> The old auto-increment IDs were totally fine—until suddenly they weren’t, because multiple shards couldn’t share the same global counter anymore.

> Their workaround was simple and surprisingly effective: they offset new IDs by a huge constant—roughly a billion. Old IDs stayed below the threshold, new IDs lived above it, and nothing collided. It worked surprisingly well, but it also taught me something.

So what was the fix? The new numbers are bigger? I need a little more detail.
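
If I'm reading the quote right, the trick amounts to something like this (a guess at the mechanics; the constants and the per-shard banding are mine, not from the article):

    package main

    import "fmt"

    // A guess at the quoted offset trick: old single-DB IDs stay below a
    // threshold, and each shard hands out IDs in its own band above it, so new
    // IDs never collide with the legacy range (or, with per-shard bands, with
    // each other).
    const (
        legacyCeiling = int64(1_000_000_000) // "roughly a billion": old IDs sit below this
        shardStride   = int64(1_000_000_000) // width of each shard's band (assumption)
    )

    // globalID maps a shard-local auto-increment value into the global range.
    func globalID(shard, localSeq int64) int64 {
        return legacyCeiling + shard*shardStride + localSeq
    }

    func main() {
        fmt.Println(globalID(0, 123)) // 1000000123
        fmt.Println(globalID(2, 123)) // 3000000123
    }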

> If your system is running on a single database with moderate traffic, auto-increment is still probably the best answer. Don’t overthink it.

If autoincrement is the simplest way to do things, but breaks if you evolve the system in any conceivable way, maybe autoincrement isn't the simplest way to do things.

Isn't that the point of the article?

The checksum idea is interesting, but why make it a tack-on at the end? Spending 20 of the random bits on a mandatory checksum seems like a real trade-off.
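
For reference, spending those 20 bits might look roughly like this (the checksum function and bit placement are my choices, not the article's):

    package main

    import (
        "fmt"
        "hash/crc32"
    )

    // addChecksum illustrates the trade-off: the final 20 bits of a 160-bit ID
    // carry a checksum of the other 140 bits instead of randomness. CRC-32
    // truncated to 20 bits is an arbitrary choice here.
    func addChecksum(id [20]byte) [20]byte {
        sum := checksum(id)
        id[17] = (id[17] & 0xF0) | byte(sum>>16) // low 4 bits of byte 17
        id[18] = byte(sum >> 8)
        id[19] = byte(sum)
        return id
    }

    // verify recomputes the checksum; a mistyped or corrupted ID can be
    // rejected before it ever reaches the database.
    func verify(id [20]byte) bool {
        stored := uint32(id[17]&0x0F)<<16 | uint32(id[18])<<8 | uint32(id[19])
        return checksum(id) == stored
    }

    // checksum covers everything except the 20 bits it will occupy.
    func checksum(id [20]byte) uint32 {
        covered := make([]byte, 18)
        copy(covered, id[:18])
        covered[17] &= 0xF0 // mask out the checksum's own bits
        return crc32.ChecksumIEEE(covered) & 0xFFFFF
    }

    func main() {
        var id [20]byte // dummy ID; in practice this comes from the generator
        id = addChecksum(id)
        fmt.Println("checksum ok:", verify(id))
    }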

An epoch shift on a 48-bit timestamp that already has >12,000 years of range, just to get another 50 years, is an amusing choice.
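
Quick back-of-the-envelope on that range (the article's tick size isn't quoted here, so a few plausible values are shown; whichever it is, 50 extra years is noise):

    package main

    import "fmt"

    // At 1 ms ticks, 2^48 covers roughly 8,900 years; the >12,000-year figure
    // suggests a slightly coarser tick, which only pushes the range further out.
    func main() {
        const ticks = 1 << 48
        const secondsPerYear = 365.25 * 24 * 3600
        for _, t := range []struct {
            name    string
            seconds float64
        }{
            {"1 ms", 1e-3},
            {"10 ms", 1e-2},
            {"1 s", 1},
        } {
            fmt.Printf("%-6s ticks -> ~%.0f years of range\n", t.name, ticks*t.seconds/secondsPerYear)
        }
    }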

> ID formats aren’t just formats. They’re commitments.

Reading direct LLM output is highly cringeworthy.