Comment by ars

11 hours ago

> Simply add random access times.

That doesn't work. Because the random delays are uniformly distributed, they can be removed from the data by additional sampling. You do make the attack harder, because it needs a lot more data, but it's still possible to extract the signal, because the noise is uniform.
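
A minimal sketch of why averaging defeats this, assuming the mitigation adds a uniform delay in `[0, max_jitter_ms]` to every access. The function names, the 10 ms jitter bound, and the 1.0 ms / 1.2 ms "true" access times are made up for illustration:

```python
import random
import statistics

def observed_ms(true_ms, max_jitter_ms, rng):
    # Hypothetical mitigation: add a uniform random delay in [0, max_jitter_ms].
    return true_ms + rng.uniform(0.0, max_jitter_ms)

def estimate_true_ms(samples, max_jitter_ms):
    # Uniform jitter has a known mean of max_jitter_ms / 2, so the attacker
    # averages many samples and subtracts that mean.
    return statistics.fmean(samples) - max_jitter_ms / 2.0

rng = random.Random(0)
max_jitter_ms = 10.0
estimates = {}
for true_ms in (1.0, 1.2):
    samples = [observed_ms(true_ms, max_jitter_ms, rng) for _ in range(200_000)]
    estimates[true_ms] = estimate_true_ms(samples, max_jitter_ms)
    print(true_ms, round(estimates[true_ms], 2))
```

The two devices differ by only 0.2 ms under 10 ms of jitter, yet the estimates separate once enough samples are collected. The noise raises the sample count needed, not the ceiling on what can be learned.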

The interesting mitigation would be snapping I/O to a coarse clock.

You could then set it to hold the result until the next tick.

E.g. with an I/O tick of 20 ms, it would only return on 20 ms boundaries, so almost every SSD would look the same.

It would slow down the API a bit, but privacy has tradeoffs.
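
Roughly, the idea could look like the sketch below. `release_time_ms` is a hypothetical helper, not an existing API, and the 3 ms and 12 ms service times are invented; only the 20 ms tick comes from the example above:

```python
import math

TICK_MS = 20  # tick length from the example above

def release_time_ms(arrival_ms, service_ms, tick_ms=TICK_MS):
    """Return the first tick boundary at or after the request completes."""
    completion_ms = arrival_ms + service_ms
    return math.ceil(completion_ms / tick_ms) * tick_ms

# A fast and a slow device, both issued exactly on a tick (t = 0),
# hand back their results at the same moment.
print(release_time_ms(0, 3))   # 20
print(release_time_ms(0, 12))  # 20
```

Note that this example issues both requests exactly on a tick boundary; requests arriving at other points in the tick period are a separate case.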

  • Probably still does not work. Assume a request takes X ms, with X < 20, and let us look at what you will observe depending on where within a tick period it arrives.

    If it arrives anywhere from 0 ms to (20 - X) ms after a tick, it will complete before the next tick, so the measured duration will be between X ms and 20 ms. If it arrives later in the tick period, it will miss the next tick and have to wait an additional tick period, so the measured duration will be between 20 ms and (20 + X) ms.

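    If you make N repetitions, you would normally see all of the probability mass in a single spike at X. With the 20 ms tick wait, you will instead see a uniform distribution of density 1/20 between X ms and (20 + X) ms, so the lower edge of the distribution (or its mean minus 10 ms) still reveals X.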

    You would have to perform each request and then return the result exactly 20 ms after it was received in order to mask the request duration. But that just creates a new target: the timers and queues you use to delay the response. An attacker could also push the load so high that requests take more than 20 ms.
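
The reply's argument can be checked with a small simulation. This is only a sketch under stated assumptions: arrivals are uniform over the tick period, the service time X is constant, and the function names and the 3 ms / 12 ms values are hypothetical.

```python
import math
import random
import statistics

TICK_MS = 20.0

def measured_with_tick(arrival_ms, service_ms, tick_ms=TICK_MS):
    # Result is held until the first tick boundary at or after completion.
    release_ms = math.ceil((arrival_ms + service_ms) / tick_ms) * tick_ms
    return release_ms - arrival_ms

def measured_with_fixed_delay(service_ms, delay_ms=TICK_MS):
    # Result is returned exactly delay_ms after arrival; this only masks
    # the duration while service_ms <= delay_ms.
    return max(service_ms, delay_ms)

rng = random.Random(1)
results = {}
for service_ms in (3.0, 12.0):
    arrivals = [rng.uniform(0.0, TICK_MS) for _ in range(100_000)]
    ticked = [measured_with_tick(a, service_ms) for a in arrivals]
    # Durations are uniform on [X, 20 + X], so mean - 10 estimates X.
    estimate = statistics.fmean(ticked) - TICK_MS / 2
    fixed = {measured_with_fixed_delay(service_ms) for _ in arrivals}
    results[service_ms] = (min(ticked), max(ticked), estimate, fixed)
    print(service_ms, round(min(ticked), 2), round(estimate, 2), fixed)
```

With tick snapping, both the smallest observed duration and the mean-based estimate track X. With the fixed delay, every request looks identical as long as X stays below 20 ms, which is exactly the condition the load-based attack tries to break.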

The random times don't have to be uniformly distributed, though it's enough for attackers to know the distribution in order to de-noise the measurements.