Comment by re
1 year ago
The full text search is a little confusing because it doesn't actually search them in order, though it appears to at first. And if you click "next" a few times and then "prev" the same number of times, you don't necessarily end up back at the same UUID you were at before. It's a neat-seeming trick though.
It’s an interesting question whether that could be fixed. I think the answer is Yes. If the author didn’t do any scrambling, and just displayed UUIDs in numeric order, then it’s trivial to enumerate search results in order. Likewise, if you do something like adding a constant mod 16 to each hex digit, you could do the same thing when you generate UUIDs matching a substring. So the question becomes whether you could find something like that that gives a sufficiently convincing illusion of entropy but is still reversibile when you hold a subset of the digits constant. And it seems like it should be.
FWIW I am super interested in this question but feel like I don't know how to derive a satisfying answer, maybe because the one of my goals here (add "enough" entropy) is a real fuzzy "I know it when I see it" sort of thing.
But I'm gonna try to get a few more crypto-knowledgeable friends to chat with me about this and write up what I learn!
You've sniped me and I'm going to try and tackle this over the weekend. What's the best way to exchange things for you? Also, have you explored modular multiplicative inverses at all?
1 reply →
My first thought was to use linear transformations over Z_2 as a field, as that would create a natural interpretation of fixing certain bits as taking a linear subspace. Interestingly this leads to the property that XOR is preserved.
I implemented this in a very quick and hacky way for 32 bits. I generated a random boolean matrix M invertible in Z_2. To turn an input number x into a corresponding number y in an N-bit space, I convert x to binary and turn that into a vector of 1s and 0s, then multiply it by that randomized matrix to get y. Here are the first few y's corresponding to x=0, 1, ...:
00000000000000000000000000000000
0xx0x000xx00x0xx000xxx00xxx0x000
x0xx000xxxx0xxx00x000xxx0xx0xx0x
xx0xx00x00x00x0x0x0xx0xxx0000x0x
x0xx0x00xxx0xx0x00xx0x0xx0xx00x0
xx0xxx0000x00xx000x0x00x0x0xx0x0
00000x0x000000xx0xxx00x0xx0xxxxx
0xx0xx0xxx00x0000xx0xxx000xx0xxx
...
(Hoping the non-monospace font doesn't ruin the alignment too much.)
Which looks... random-ish? I expect that turning these into UUIDs may result in more random-looking sequences.
Not sure how to solve the problem of search with this, but the hope would be that the linear structure gives you what you need, since fixing bytes on UUIDs should correspond to considering linear subspaces of the y vectors. Perhaps this can also be used to apply lexicographic order on the corresponding x vectors (i.e.: ordering the indexes), so that you could jump to each UUID matching the search in order.
6 replies →
Would it be easier if only the lowest bits of the index contributed to the "entropy"? Like if the 2^16th UUID was the one right after the first and n + 2^16 was the one after the nth. You wouldn't notice the pattern but it'd be easier for the computer to handle. I guess if you were searching for a substring you might notice the ones surrounding the matches look almost the same from match to match...
Let me cook something up for you. It's an interesting puzzle.
I think you can get pretty far, if you compromise on your entropy: it only has to look random, not actually be random. (I mean it doesn't have to be cryptographically secure randomness.)
Yes, it's possible with certain restrictions on the function. Here's an example:
This is a T-function, a function where the Kth bit only depends on the K-1 lower bits. prf() is a pseudo-random function that can be anything you like as long as it's another T-function. A standard LCRNG like (48271 * x) % 2147483647 works just fine for instance.
You can invert this by just running the function N times to test each bit from LSB to MSB against the result. If you know certain bits, you don't have to run those tests and you can order the matching UUIDs by the values of the K unknown bits from 0 to 2^(N - K) - 1.
You also don't need to keep track of the position with this method either, only a stopping point. Unlike the feistel network, this function also produces a full cycle when iterated as x_i+1 = f(x_i), only repeating after all numbers are produced. That means you could run this without a counter at all, just generating until the first value is produced again, for any bit length.