← Back to context

Comment by AlotOfReading

3 hours ago

You haven't explained what the benefit is. There aren't "spaces we haven't formalized" because of the pigeonhole principle. There are M bits. You can generate every one of those 2^M values with any max cycle permutation.

What work is being offloaded from computers to people? It's exactly the same thing with more determinism and no logarithmic overhead.

> There aren't any "spaces we haven't formalized"

Suppose that space of N points is partitioned into M relevant subsets, for now we assume of the same size. Then random sampling hits each of those subsets in O(M log M) time, even if we don't know what they are.

This sort of partitioning is long talked about in the testing literature, with the idea you should do it manually.

> what work is being offloaded

The need to write that program for explicitly enumerating the space.

  • Just to avoid potential confusion, the claim is that this is a function that generates a random permutation:

        pub fn shuffle(g: *Gen, T: type, slice: []T) void {
            if (slice.len <= 1) return;
    
            for (0..slice.len - 1) |i| {
                const j = g.range_inclusive(u64, i, slice.len - 1);
                std.mem.swap(T, &slice[i], &slice[j]);
            }
        }
    

    And this is a function that enumerates all permutations, in order, exactly once:

        pub fn shuffle(g: *Gen, T: type, slice: []T) void {
            if (slice.len <= 1) return;
    
            for (0..slice.len - 1) |i| {
                const j = g.range_inclusive(u64, i, slice.len - 1);
                std.mem.swap(T, &slice[i], &slice[j]);
            }
        }
    

    Yes, they are exactly the same function. What matters is Gen. If it looks like this

    https://github.com/tigerbeetle/tigerbeetle/blob/809fe06a2ffc...

    then you get a random permutation. If it rather looks like this

    https://github.com/tigerbeetle/tigerbeetle/blob/809fe06a2ffc...

    you enumerate all permutations.