Comment by jxjnskkzxxhx

3 months ago

> That's exactly it, though.

Ok, so instead of estimating B/A=1/100 just estimate A/B=100. What's the problem with this argument?

13 comments

jxjnskkzxxhx

My point is that we're not just talking about A/B of 100, but rather A/B of 10000000000000 or more. If you're inverting the ratio we're estimating, then instead of estimating the value at 0, we're estimating it at infinity (or, okay, if you want to use Laplace smoothing, then a million, which is far too low).

There are situations where you would use rejection sampling because generating 100x more random numbers is much cheaper than doing the complex calculations required to accurately model the space in question. Maybe 50 dimensions (and the 13 orders of magnitude difference) isn't big enough to raise those concerns. If we instead talk about 100 dimensions, then we're dealing with a difference of 40 orders of magnitude, and if you tell me that still doesn't matter, then I don't know what to say.
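
For reference, the orders of magnitude quoted above can be checked in closed form. This is a minimal sketch, assuming the comparison is a unit ball against its bounding cube [-1, 1]^d (the thread does not say which cube is meant). The function names are illustrative.

```python
import math

def log10_ball_volume(d, r=1.0):
    """log10 of the volume of a d-dimensional ball of radius r."""
    # V_d(r) = pi^(d/2) / Gamma(d/2 + 1) * r^d, evaluated in log space
    # so that large d cannot underflow.
    ln_v = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1) + d * math.log(r)
    return ln_v / math.log(10)

def log10_acceptance(d):
    """log10 of the chance that a uniform point in [-1, 1]^d lands in the unit ball."""
    # Assumption: the proposal region is the bounding cube, whose volume is 2^d.
    return log10_ball_volume(d) - d * math.log10(2)

for d in (2, 10, 50, 100):
    print(d, round(log10_ball_volume(d), 1), round(log10_acceptance(d), 1))
```

The ball's volume alone comes out near 1.7e-13 for d = 50 and 2.4e-40 for d = 100, in line with the 13 and 40 orders of magnitude mentioned above. Measured against the bounding cube [-1, 1]^d, the acceptance probability is smaller by a further factor of 2^d.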

jxjnskkzxxhx 3 months ago
You haven't addressed my question at all. If you think estimating 1e-1000 is a problem, then estimating 1e+1000 shouldn't be. But one is just the inverse of the other. What's the problem with this argument? You haven't answered it at all.
- vitus 3 months ago
  
  > If you think estimating 1e-1000 is a problem, then estimating 1e+1000 shouldn't be.
  They're both problems. If you want to estimate 1e1000 via sampling individual points, then you need at least that order of magnitude of samples. If all of your data points fall in one class, then it doesn't matter what you're trying to calculate from that.
  As I said: "If you're inverting the ratio we're estimating, then instead of estimating the value at 0, we're estimating it at infinity (or, okay, if you want to use Laplace smoothing, then a million, which is far too low)."
  
- jdhwosnhw 3 months ago
  
  Do you know how to program? It’s super easy to write a very simple rejection-based program to estimate the volume of a hypersphere (you can do it in <10 lines with numpy). Try it yourself for dimensionality 50 and see how long it takes before the estimate rises above exactly 0.
  
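
The experiment suggested in the last comment can be sketched roughly as follows. The comment mentions numpy; this version uses only the standard library so it runs without extra dependencies, and it assumes points are drawn uniformly from the cube [-1, 1]^d. The function name, sample count, and seed are illustrative choices.

```python
import random

def estimate_ball_volume(d, n_samples, seed=0):
    """Rejection-sampling estimate of the unit ball's volume in d dimensions.

    Returns (volume_estimate, number_of_hits).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(n_samples):
        # Draw a point uniformly from [-1, 1]^d and keep it only if it
        # falls inside the unit ball (squared norm at most 1).
        if sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(d)) <= 1.0:
            hits += 1
    # Fraction of accepted points times the cube's volume, 2^d.
    return hits / n_samples * 2.0 ** d, hits

for d in (2, 5, 10, 50):
    vol, hits = estimate_ball_volume(d, 20_000)
    print(f"d={d:3d}  hits={hits:6d}  estimate={vol:.4g}")
```

With an acceptance probability p, the expected number of draws before the first hit is 1/p, so until roughly that many samples have been drawn the estimate stays at exactly 0, whichever way round the ratio is written.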