← Back to context

Comment by thaumasiotes

1 year ago

> If we didn’t care about generating valid UUID v4s, we could just generate 2^128 numbers, scramble them, convert the resulting bits into a hex string, and intersperse some dashes.

You can do that anyway. You'd only need the twiddling if you wanted to limit the amount of numbers you generate to 2^122. Since you're willing to generate 128 bits:

    // get encrypted value
    uint128 bits = encipher(seed);
    // clear zero bits
    bits &= 0xFFFFFFFF FFFF 4FFF BFFFFFFFFFFF; // instead of 4 and B, you can use 0 and 3
    // set one bits
    bits |= 0x00000000 0000 4000 800000000000; // these have to be 4 and 8

But since you're generating more numbers than you need, you're going to end up with 64 copies of each UUID in your final stream. This won't matter because no one will ever be able to notice, as long as your faux search functions avoid pointing it out.

Exercise for further development: modify substring search so that it follows the expected behavior of finding matches in order within the page. [I don't recommend attempting this.]

I considered this, but I'd just be so unsatisfied if I finished scrolling through all 2^128 rows and realized I'd seen some duplicates!

  • I liked the approach movpasd suggested: https://news.ycombinator.com/item?id=42346076

    With a linear algebra library, you can guarantee that you've found the next, or the previous, match in sequence. I don't know what the state of the art is for fast linear algebra in javascript, though.

    (The matrix approach also has the advantage that, when your full-text search problem has 2^115 solutions, you can compute the one you want, the next one after some index, without having to compute them all.)

    • I need to learn more linear to be able to appreciate this! Gonna do some reading.

      very fun to have received so many pointers here, hopefully I'll be able to do a follow up blog once I've finally let people find all the good UUIDs