Comment by beardyw
1 day ago
Just a stupid question, but why not append the date, even in seconds as hex. It's just a few bytes and would guarantee that everything OK now will be OK in the future?
You can just use a different UUID version that includes timestamp data instead (e.g. v1 or v7). There are also versions that include the MAC address.
Might as well just use uuidv7
But since the randomness is obviously borked, it was much better to use v4 and find out about it after just 15K records instead of X million records later.
yeah, any sort of additional semi-random data could've helped prevent this, I'm sure. That, however, is also kind of the idea of UUIDv4, it has lots of randomness and time built in already.
UUID v4 consists of only random bits, no timestamp info.
Wrong, they have 122 random bits out of 128. The other six bits are to say “hello I am a UUIDv4”.
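For anyone who wants to check the 122 + 6 split themselves, here is a minimal sketch using Python's standard `uuid` module. The helper name `fixed_bits` is made up for illustration; the bit positions follow the standard UUID layout (4 version bits plus 2 variant bits).

```python
import uuid

def fixed_bits(u):
    """Return (version nibble, 2-bit variant field) of a UUID."""
    version = (u.int >> 76) & 0xF   # the 4 bits that say "I am version N"
    variant = (u.int >> 62) & 0b11  # the 2 bits that mark the standard layout
    return version, variant

u = uuid.uuid4()
version, variant = fixed_bits(u)
print(u, version, bin(variant))

# 128 total bits minus the 4 + 2 fixed bits leaves 122 random bits.
RANDOM_BITS = 128 - 4 - 2
```

Every v4 UUID produced by `uuid.uuid4()` should report version 4 and variant bits `10`; everything else comes from the random source.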
oh, interesting, I didn't know that. This could be part of the problem, depending on what's used as the seed.
But surely hashing the date still allows for a future collision. Leaving the date as is means it will never collide after that one second has passed.
You could do that, but now you're like 90% of the way to maintaining a monotonically increasing number that you could just use as a unique ID instead, without any randomness required (and without the additional 128 bits for collision protection via the appended UUID).
So your ID would take like 64 bits for the time unique to the nanosecond plus 128 bits for the UUIDv4 = 192 bits which is a pretty beefy sized ID.
(I know you said just append a second count but you will want a predictable/fixed size for your data structure in pretty much any use case so need to decide the upper bound and precision ahead of time)
Especially when the alternative is a 128-bit UUIDv4, which is effectively guaranteed unique given proper use of a high-quality RNG, or a 128-bit UUIDv7 if you have a clock (which your method needs anyway). The v7 will be much more forgiving of a flaky source of randomness and more sortable than your monotonic-ish ID, for 1/3 fewer bits.
Basically, stapling anything onto a UUID is a waste of space if you don't trust it, so might as well drop it completely and use a significantly smaller source of randomness at that point.
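To make the 64 + 128 = 192 bit arithmetic above concrete, here is a rough sketch of the "staple a timestamp onto a UUIDv4" layout. The function name `timestamped_id` and the choice of a big-endian 64-bit nanosecond count are assumptions for illustration, not anything proposed in the thread beyond the sizes.

```python
import struct
import time
import uuid

def timestamped_id(now_ns=None):
    """Fixed-width ID: 8-byte big-endian nanosecond timestamp + 16-byte UUIDv4."""
    if now_ns is None:
        now_ns = time.time_ns()
    # ">Q" packs an unsigned 64-bit integer, so the width never varies.
    return struct.pack(">Q", now_ns) + uuid.uuid4().bytes

ident = timestamped_id()
print(len(ident) * 8, ident.hex())
```

The result is always 24 bytes (192 bits), which is the fixed-size point made above: the precision and upper bound of the timestamp have to be chosen up front.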
UUIDv7 does not hash the date. It uses 48 bits to store a millisecond-resolution timestamp. This allows you to sort UUIDs by time.
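A hand-rolled sketch of that layout, for illustration only. `uuid7_sketch` and `timestamp_ms` are hypothetical helpers, and this is not a substitute for a library implementation (it makes no attempt at ordering within the same millisecond). It assumes the UUIDv7 field layout: 48-bit Unix millisecond timestamp, 4 version bits, 12 random bits, 2 variant bits, 62 random bits.

```python
import os
import time
import uuid

def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-shaped value from a millisecond timestamp plus random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # 62 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80     # timestamp in the top 48 bits
    value |= 0x7 << 76                            # version nibble = 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # standard variant bits
    value |= rand_b
    return uuid.UUID(int=value)

def timestamp_ms(u):
    """Read the 48-bit millisecond timestamp back out of the top bits."""
    return u.int >> 80

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier, later, earlier < later)
```

Because the timestamp sits in the most significant bits, IDs from different milliseconds sort by time whether compared as integers or as hex strings.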
> but why not append the date
And use uuid v5 to hash it :)
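Joke or not, it is easy to see what that would give you. A small sketch, assuming the date is formatted as an ISO 8601 string to the second and using `uuid.NAMESPACE_URL` as an arbitrary namespace (both are choices made here, not from the thread); `date_uuid5` is a made-up helper name.

```python
import uuid
from datetime import datetime, timezone

def date_uuid5(when):
    """Hash a timestamp (to the second, in UTC) into a v5 UUID."""
    name = when.astimezone(timezone.utc).isoformat(timespec="seconds")
    # uuid5 is a deterministic SHA-1 based hash of (namespace, name).
    return uuid.uuid5(uuid.NAMESPACE_URL, name)

t = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(date_uuid5(t))
print(date_uuid5(t) == date_uuid5(t))
```

Since v5 is deterministic, two records created in the same second get the same ID, so the hashed date adds no uniqueness on its own.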