Comment by Voloskaya

5 days ago

Off topic, but am I the only one getting triggered every time I see a rationalist quantify their prediction of the future with single-digit precision? It's like their magic way of trying to get everyone to forget that they reached their conclusion in a completely hand-wavy way, just like every other human being. But instead of saying "low confidence" or "high confidence" like the rest of us normies, they will tell you they think there is a 16.27% chance, because they really, really want you to be aware that they know Bayes' theorem.

Interestingly, this is actually a question that's been looked at empirically!

Take a look at this paper: https://scholar.harvard.edu/files/rzeckhauser/files/value_of...

They took high-precision forecasts from a forecasting tournament and rounded them into coarser buckets (nearest 5%, nearest 10%, nearest 33%) to see whether the precision was actually conveying any real information. What they found is that rounding the forecasts of expert forecasters consistently worsened their Brier scores, suggesting that expert precision even at the 5% level conveys useful, if noisy, information. They also found that less-expert forecasters took less of a hit from rounding, which makes sense.
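
To make the rounding procedure concrete, here's a toy sketch in Python (made-up forecasts and outcomes, not the paper's tournament data). In this simulated setup, the coarser the rounding, the worse the Brier score (lower is better), mirroring the paper's finding:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Simulate a skilled forecaster: true event probabilities plus small noise.
    true_p = rng.uniform(0.01, 0.99, size=n)
    outcomes = (rng.uniform(size=n) < true_p).astype(float)
    forecasts = np.clip(true_p + rng.normal(0, 0.03, size=n), 0.001, 0.999)

    def brier(p, y):
        # Mean squared error between stated probability and 0/1 outcome.
        return np.mean((p - y) ** 2)

    def round_to(p, step):
        # Coarsen each forecast to the nearest multiple of `step`.
        return np.round(p / step) * step

    print("raw        :", brier(forecasts, outcomes))
    print("nearest 5% :", brier(round_to(forecasts, 0.05), outcomes))
    print("nearest 10%:", brier(round_to(forecasts, 0.10), outcomes))
    print("nearest 33%:", brier(round_to(forecasts, 1 / 3), outcomes))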

It's a really interesting paper, and they recommend that foreign policy analysts try to increase precision rather than retreating to lumpy buckets like "likely" or "unlikely".

Based on this, it seems totally reasonable for a rationalist to make guesses with single-digit precision, and I don't think it's really worth criticizing.

  • Likely vs. unlikely is rounding to the nearest 50%. Single digit is rounding to the nearest 1%. I don't think the parent was suggesting the former is better than the latter. Even before I read your comment I thought that 5% precision is useful, but 1% precision is a silly turn-off, unless that 1% is near the 0% or 100% boundary.

    • The book Superforecasting documented that for their best forecasters, rounding off that last percent would reliably worsen Brier scores.

      Whether rationalists who are publicly commenting actually achieve that level of reliability is an open question. But it has been demonstrated that humans can be reliable enough in the real world that the last percentage point matters.


    • The most useful frame here is looking at log odds. Going from 15% -> 16% means

      -log_2(.15/(1-.15)) -> -log_2(.16/(1-.16))

      =

      2.5 -> 2.39

      So saying 16% instead of 15% implies an additional tenth of a bit of evidence in favor (alternatively, 16/15 ~= 1.07 ~= 2^.1).

      I don't know if I can weigh in on whether humans should drop a tenth of a bit of evidence to make their conclusion seem less confident. In software (e.g. a spam detector), dropping that much information to make the conclusion more presentable would probably be a mistake.
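
      A quick sanity check of that arithmetic (a minimal Python sketch; the 15%/16% figures are just the example above):

          import math

          def log_odds_bits(p):
              # Base-2 log odds of p; a difference of 1.0 is one bit of evidence.
              return math.log2(p / (1 - p))

          delta = log_odds_bits(0.16) - log_odds_bits(0.15)
          print(round(delta, 3))  # ~0.11 bits of additional evidence in favor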

Would you also get triggered if you saw people make a bet at, say, $24 : $87 odds? Would you shout: "No! That's too precise, you should bet $20 : $90!"? For that matter, should all prices in the stock market be multiples of $1 (since, after all, fluctuations of greater than $1 are very common)?

If the variance (uncertainty) in a number is large, the correct thing to do is to also report the variance, not to round the mean to a whole number.

Also, in log odds, the difference between 5% and 10% is about the same as the difference between 40% and 60%. So using an intermediate value like 8% is less crazy than you'd think.
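
A minimal check of that claim in Python, using natural-log odds (the exact outputs don't matter, just that the two gaps are comparable):

    from math import log

    def logit(p):
        # Natural-log odds of probability p.
        return log(p / (1 - p))

    print(logit(0.10) - logit(0.05))  # ~0.75 nats
    print(logit(0.60) - logit(0.40))  # ~0.81 nats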

People writing comments in their own little forum, where they happen not to use sig-figs to communicate uncertainty, are probably not making a sinister attempt to convince "everyone" that their predictions are somehow scientific. For one thing, I doubt most people are dumb enough to be convinced by that, even if it were the goal. For another, the expected audience for these comments was not "everyone"; it was specifically people who are likely to interpret those probabilities in a Bayesian way (i.e. as subjective probabilities).

  • > Would you also get triggered if you saw people make a bet at, say, $24 : $87 odds? Would you shout: "No! That's too precise, you should bet $20 : $90!"? For that matter, should all prices in the stock market be multiples of $1 (since, after all, fluctuations of greater than $1 are very common)?

    No.

    I responded to the same point here: https://news.ycombinator.com/item?id=44618142

    > the correct thing to do is to also report the variance

    And do we also pull this one out of thin air?

    Using precise numbers to convey extremely imprecise and ungrounded opinions is IMHO wrong, and to me unsettling. I'm pulling this purely out of my ass, and maybe I'm making too much of it, but I feel this is part of what is causing the many cases of very weird, borderline asocial/dangerous behaviours of some people associated with the rationalist movement. When you try to precisely quantify what cannot be precisely quantified, and start trusting those numbers too much, you can easily be led to trust your conclusions far too much. I am 56% confident this is a real effect.

    • I mean, sure, people can use this to fool themselves. I think usually the cause of someone fooling themselves is "the will to be fooled", and not so much the fact that they used precise numbers in their internal monologue as opposed to verbal buckets like "pretty likely" or "very unlikely". But if your estimate of 56% sometimes actually makes a difference, then who am I to argue? Sounds super accurate to me. :)

      In all seriousness, I do agree it's a bit harmful for people to use this kind of reasoning but only ever practice it on things like AGI that will not be resolved for years and years (and maybe we'll all be dead when it does get resolved). Ideally you'd be doing hand-wavy reasoning with precise probabilities about whether you should bring an umbrella on a trip, or apply for that job, etc. Then you get to practice with actual feedback and learn how not to make dumb mistakes while reasoning in that style.

      > And do we also pull this one out of thin air?

      That's what we do when training ML models sometimes. We'll have the model output a Gaussian distribution by supplying both a mean and a variance (pulled out of thin air, so to speak). It has to give its best guess of the mean, and if the variance it reports is too small, it gets penalized accordingly. Having the model supply an entire probability distribution is even more flexible (and even less communicable by mere rounding). Of course, as commenter danlitt mentioned, this isn't relevant to binary outcomes anyway, since the whole distribution is described by a single number.
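
      A minimal sketch of that training signal (plain Python, hypothetical numbers): the Gaussian negative log-likelihood rewards an honest variance and punishes an overconfident one.

          import math

          def gaussian_nll(y, mu, sigma):
              # Negative log-likelihood of observation y under N(mu, sigma^2).
              return 0.5 * math.log(2 * math.pi * sigma**2) + (y - mu) ** 2 / (2 * sigma**2)

          y, mu = 3.0, 1.0                       # actual value vs. predicted mean
          print(gaussian_nll(y, mu, sigma=0.1))  # overconfident variance: ~198.6
          print(gaussian_nll(y, mu, sigma=2.0))  # honest variance: ~2.1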


  • > If the variance (uncertainty) in a number is large, correct thing to do is to just also report the variance

    I really wonder what you mean by this. If I put my finger in the air and estimate the emergence of AGI as 13%, how do I get at the variance of that estimate? At face value, it is a number, not a random variable, and does not have a variance. If you instead view it as a "random sample" from the population of possible estimates I might have made, it does not seem well defined at all.

    • I meant in a general sense that when reporting measurements/estimates of real numbers, it's better to report the uncertainty of the estimate alongside the estimate, instead of using some kind of janky rounding procedure to try to communicate that information.

      You're absolutely right that if you have a binary random variable like "IMO gold by 2026", then the only thing you can report about its distribution is the probability of each outcome. This only makes it even more unreasonable to try to communicate some kind of "uncertainty" with sig-figs, as the person I was replying to suggested doing!

      (To be fair, in many cases you could introduce a latent variable that takes on continuous values and is closely linked to the outcome of the binary variable, e.g. "chance of the very best model in 2025 solving a random IMO problem". That latent distribution would have both a mean and a variance (and skew, etc.), and it could map to a "distribution over probabilities".)
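
      For instance (a toy sketch with scipy; the Beta parameters are made up, not anyone's actual estimate), a Beta distribution over the latent solve rate has a mean, a variance, and a full shape to report:

          from scipy.stats import beta

          # Hypothetical belief about the latent solve rate of the best 2025 model.
          solve_rate = beta(a=2, b=8)

          print(solve_rate.mean())  # 0.2     -- the point estimate
          print(solve_rate.var())   # ~0.0145 -- the uncertainty around it
          # The binary event can then inherit a probability from the latent one,
          # e.g. the chance the solve rate exceeds some gold-medal threshold:
          print(solve_rate.sf(0.5))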

No, you are right, this hyper-numericalism is just astrology for nerds.

  • In the military they estimate distances this way when they don't have proper tools. Each soldier gives a min-max range, and the region where most of the ranges overlap is taken. It's a reasonable way to make quick, intuition-based decisions when no other method is available.

    • If you actually try to flesh out the reasoning behind the distance-estimation strategy, it will turn out 100x more convincing than the analogous argument for Bayesian probability estimates. (And for any Bayesians reading this: please don't multiply the probability by 100.)

If you take it with a grain of salt, it's better than nothing. In life, sometimes the best way to express an opinion is to quantify your intuition. To make decisions you could compile multiple experts' intuitive estimates and take the median or similar. There are some cases where it's more straightforward and rote, e.g. in the military, if you have to make distance-based decisions, you might ask 8 of your soldiers to each name a number they think the distance is and take the median (a quick sketch below).
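
A minimal sketch of both aggregation schemes mentioned above (all numbers hypothetical):

    from statistics import median

    # Point estimates: take the median of the soldiers' guesses (meters).
    guesses = [300, 350, 320, 400, 280, 340, 360, 330]
    print(median(guesses))  # 335.0

    # Min-max range estimates: approximate "where most ranges overlap"
    # by the median of the lows and the median of the highs.
    ranges = [(250, 400), (300, 450), (280, 380), (320, 420)]
    lows, highs = zip(*ranges)
    print(median(lows), median(highs))  # 290.0 410.0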

No, you’re definitely not the only one… 10% is OK, 5% maybe, 1% is useless.

And while we’re at it: why not give confidence intervals too?

>Off topic, but am I the only one getting triggered every time I see a rationalist

The rest of the sentence is not necessary. No, you're not the only one.

You could look at 16% as roughly equivalent to a dice roll (1 in 6) or, you know, the odds you lose a round of Russian roulette. That's my charitable interpretation at least. Otherwise it does sound silly.

There is no honor in hiding behind euphemisms. Rationalists say ‘low confidence’ and ‘high confidence’ all the time, just not when they're making an actual bet and need to directly compare credences. And the 16.27% mockery is completely dishonest. They used less than a single significant figure.

  • > just not when they're making an actual bet

    That is not my experience talking with rationalists irl at all. And that is precisely my issue, it is pervasive in every day discussion about any topic, at least with the subset of rationalists I happen to cross paths with. If it was just for comparing ability to forecast or for bets, then sure it would make total sense.

    Just the other day I had a conversation with someone about working in AI safety. It went something like: "well, I think there is a 10 to 15% chance of AGI going wrong, and if I join I have maybe a 1% chance of being able to make an impact, and if... and if... and if, so if we compare with what I'm missing by not going to <biglab> instead, I have 35% confidence it's the right decision".

    What makes me uncomfortable with this is that by using this kind of reasoning and coming out with a precise figure at the end, you cognitively bias yourself into being more confident in your reasoning than you should be, because we are all used to treating numbers as the output of a deterministic, precise, scientific process.

    There is no reason to say 10% or 15% and not 8% or 20% for rogue AGI; there is no reason to think one individual can change the direction by 1% and not by 0.3% or 3%. It's all just random numbers, and so when you multiply a gut-feeling number by a gut-feeling number 5 times in a row, you end up with something absolutely meaningless, where the margin of error is basically 100% (see the simulation below).
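
    A quick Monte Carlo illustration of that compounding (the ranges are my own made-up stand-ins for the gut feelings in the anecdote, not anyone's actual numbers):

        import numpy as np

        rng = np.random.default_rng(0)
        n = 100_000

        # Each factor is a gut feeling with wide uncertainty, modeled as uniform.
        p_agi_wrong = rng.uniform(0.08, 0.20, n)   # the stated "10 to 15%"
        p_impact    = rng.uniform(0.003, 0.03, n)  # the stated "1%"
        p_other     = rng.uniform(0.2, 0.8, n)     # one more "and if..." factor

        product = p_agi_wrong * p_impact * p_other
        lo, mid, hi = np.percentile(product, [5, 50, 95])
        print(f"5th pct: {lo:.5f}  median: {mid:.5f}  95th pct: {hi:.5f}")
        # The 90% interval spans roughly an order of magnitude, which a
        # single precise-looking output number completely hides.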

    But it somehow feels more scientific and reliable because it's a precise number, and I think this is dishonest and misleading, both to the speakers themselves and to listeners. "Low confidence", or "I'm really not sure, but I think...", have the merit of not hiding a gut-feeling process behind a scientific veil.

    To be clear, I'm not saying you should never use numbers to try to quantify gut feelings; it's OK to say "I think there is maybe a 10% chance of rogue AGI, and thus I want to do this or that". What I really don't like is stacking multiple random predictions on top of one another and then reasoning from the result as if it were solid.

    > And the 16.27% mockery is completely dishonest.

    Obviously satire

    • I wonder if what you observe is a direct effect of the rationalist movement worshipping the god of Bayes.