Comment by ralferoo

1 day ago

BTW I realised I didn't address why those bits are necessary. While it might seem that spending more bits on randomness reduces the risk of collisions, that's not necessarily true.

The old schemes generated numbers that weren't uniformly distributed across the 128-bit space, because they were intentionally biased in certain ways, such as by time [0] and MAC address [1]. Most of the IDs generated under those schemes have many bits in common, so the UUIDs already out there are clustered in parts of that 128-bit space [2]. If you just used the whole 128 bits for random data, without reserving those extra bits to avoid conflicts with the previous schemes, then any random ID that happened to look valid under one of the previous schemes would be more likely to collide.
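To make "those extra bits" concrete, here's a rough sketch in Python (my choice of language, using only the standard `uuid` module). It assumes the RFC 4122 layout, where a random UUID keeps 4 bits for the version and 2 bits for the variant, leaving 122 random bits. The variable names are just for illustration.

```python
import os
import uuid

# 16 bytes of raw randomness: at this point all 128 bits are random.
raw = os.urandom(16)

# Passing version=4 makes the uuid module overwrite the version and
# variant fields, so only 122 of the original 128 bits stay random.
u = uuid.UUID(bytes=raw, version=4)

# The version lives in the top 4 bits of the time_hi_and_version field.
version_nibble = (u.int >> 76) & 0xF
# The variant lives in the top 2 bits of the clock_seq_hi field.
variant_bits = (u.int >> 62) & 0b11

print(u)
print(version_nibble, bin(variant_bits))
```

Those fixed fields are what keep a random ID from ever being mistaken for a time-based or MAC-based one, since the older schemes carry a different version number in the same position.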

Of course, this only matters if global uniqueness matters to you. For a closed system with a guaranteed scope, sure, who cares? But given that extra randomness doesn't add any useful value beyond a certain threshold, you might as well use a UUID: you don't know what that identifier might end up being used for in the future, and you can use off-the-shelf systems to generate them.
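As a back-of-the-envelope check on the "beyond a certain threshold" point, here's a small Python sketch using the standard birthday-bound approximation. The figure of a trillion IDs is a made-up workload, and `collision_probability` is a hypothetical helper, not something from any library.

```python
import math

def collision_probability(n_ids: int, random_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n^2 / 2^(bits + 1))."""
    x = n_ids * n_ids / 2 ** (random_bits + 1)
    # expm1 keeps precision when x is tiny, where 1 - exp(-x) would round to 0.
    return -math.expm1(-x)

n = 10 ** 12  # hypothetical workload: a trillion IDs
p122 = collision_probability(n, 122)  # random bits left in a random UUID
p128 = collision_probability(n, 128)  # all 128 bits random
print(f"{p122:.3e} {p128:.3e} {p122 / p128:.1f}")
```

Giving up six bits costs roughly a factor of 2^6 = 64 in collision probability, but under this approximation both numbers are already vanishingly small at that scale.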

[0] Ironically, future-proofed time fields with many bits are more likely to be non-uniformly distributed, e.g. the original version 0 UUID supported timestamps from 1582 AD to 5236 AD but was only used from 1987 for around a decade.

[1] With certain manufacturers of network cards massively more popular than others, their MAC address prefixes showed up significantly more frequently. There were also privacy concerns: you could correlate UUIDs generated on a single machine, and sometimes infer which machines might be on the same network, because they had similar MAC addresses and so the cards were probably all from the same manufacturing batch.

[2] Which is fine within the scope of UUIDs, as they are still very likely to be globally unique, so it doesn't really matter if bits are wasted in this scheme.
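For the timestamp range in [0], the arithmetic can be checked with a few lines of Python. This assumes the layout RFC 4122 specifies for time-based UUIDs: a 60-bit count of 100-nanosecond intervals starting at 1582-10-15. `EPOCH`, `TICKS` and `rollover` are just names for this sketch.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1582, 10, 15)  # first day of the Gregorian calendar
TICKS = 2 ** 60                 # capacity of a 60-bit counter of 100 ns ticks

# timedelta has microsecond resolution, and one microsecond is 10 ticks.
rollover = EPOCH + timedelta(microseconds=TICKS // 10)
print(rollover.year)
```

The counter runs out in the year 5236, so a field that is only ever stamped within a few decades leaves the upper timestamp bits effectively constant.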