Comment by jxjnskkzxxhx

5 days ago

You haven't addressed my question at all. If you think estimating 1e-1000 is a problem, then estimating 1e+1000 shouldn't be. But one is just the inverse of the other. What's the problem with this argument? You haven't answered that.

> If you think estimating 1e-1000 is a problem, then estimating 1e+1000 shouldn't be.

They're both problems. If you want to estimate 1e1000 via sampling individual points, then you need at least that order of magnitude of samples. If all of your data points fall in one class, then it doesn't matter what you're trying to calculate from that.

As I said: "If you're inverting the ratio we're estimating, then instead of estimating the value at 0, we're estimating it at infinity (or, okay, if you want to use Laplace smoothing, then a million, which is far too low)."

  • > If you want to estimate 1e1000 via sampling individual points, then you need at least that order of magnitude of samples.

    Ok so if the ratio is 1/2 how many samples do you need?

    • I mean, yes, you can estimate this for low dimension. It's a bad idea given how slow the convergence is, but you can do it.

      My entire point is that this becomes infeasible very quickly for numbers that are not all that big.

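For reference, the sample counts being argued about here can be written down with the standard binomial error formula. This is a sketch under stated assumptions: N independent uniform samples, a true ratio p, k hits, and a target relative error ε. The symbols p̂, k, and ε are introduced for illustration and do not appear in the thread.

```latex
\hat{p} = \frac{k}{N}, \qquad k \sim \mathrm{Binomial}(N, p)

\frac{\operatorname{sd}(\hat{p})}{p} = \sqrt{\frac{1 - p}{pN}}
\quad\Longrightarrow\quad
N \approx \frac{1 - p}{p\,\varepsilon^{2}}

\Pr(k = 0) = (1 - p)^{N} \approx e^{-pN}
```

With p = 1/2 and ε = 0.01 this gives N ≈ 10^4. With p = 1e-1000 it gives N on the order of 1e1000/ε², and until N is comparable to 1/p the most likely outcome is zero hits. The dimension appears only through p.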

Do you know how to program? It’s super easy to write a very simple rejection-based program to estimate the volume of a hypersphere (you can do it in <10 lines with numpy). Try it yourself for dimensionality 50 and see how long it takes before the estimate rises above exactly 0.

  • Read my conversation with the other person. If you sample N times, you can put a bound on the ratio, and that bound is the same regardless of dimension. To your example: if the ratio between the sets in D=50 is 1e-50 and you ask how long it takes until your estimated bound no longer contains zero, that will take a long time. Now if I ask you to estimate the ratio between the 2D circle and the 2D square to 50 decimal places, it will take the same time. Therefore, dimension doesn't enter here. This is a general property of Monte Carlo.
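For anyone who wants to try the experiment described above, here is a minimal sketch of a rejection-based estimate, not anyone's actual code from this thread. It assumes the "ratio" means the volume of the unit ball divided by the volume of its bounding cube [-1, 1]^d. The function names `exact_fraction` and `rejection_estimate`, the seed, and the batch size are made up for illustration.

```python
import math
import numpy as np


def exact_fraction(d):
    """Exact volume of the unit d-ball divided by the volume of the cube [-1, 1]^d."""
    return math.pi ** (d / 2) / (math.gamma(d / 2 + 1) * 2.0 ** d)


def rejection_estimate(d, n, seed=0, batch=50_000):
    """Draw n uniform points in [-1, 1]^d and count how many land inside the unit ball.

    Returns (hits, hits / n). Points are drawn in batches to keep memory bounded.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    remaining = n
    while remaining > 0:
        m = min(batch, remaining)
        pts = rng.uniform(-1.0, 1.0, size=(m, d))
        # A point is accepted when its squared Euclidean norm is at most 1.
        hits += int(np.count_nonzero((pts * pts).sum(axis=1) <= 1.0))
        remaining -= m
    return hits, hits / n


if __name__ == "__main__":
    n = 200_000
    for d in (2, 10, 50):
        hits, est = rejection_estimate(d, n)
        # With zero hits, the "rule of three" gives an approximate 95% upper bound of 3 / n.
        bound = f", p < ~{3 / n:.1e} (95%)" if hits == 0 else ""
        print(f"d={d:2d}  exact={exact_fraction(d):.3e}  estimate={est:.3e}  hits={hits}{bound}")
```

Under this assumption the exact ratio for d=50 comes out around 1e-28, so any feasible run should report zero hits. The only statement the samples then support is an upper bound of roughly 3/n, and that bound depends on n alone, not on d.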