Comment by Rperry2174
2 months ago
One thing this really highlights to me is how often the "boring" takes end up being the most accurate. The provocative, high-energy threads are usually the ones that age the worst.
If an LLM were acting as a kind of historian revisiting today’s debates with future context, I’d bet it would see the same pattern again and again: the sober, incremental claims quietly hold up, while the hyperconfident ones collapse.
Something like "Lithium-ion battery pack prices fall to $108/kWh" is classic cost-curve progress. Boring, steady, and historically extremely reliable over long horizons. Probably one of the most likely headlines today to age correctly, even if it gets little attention.
On the flip side, stuff like "New benchmark shows top LLMs struggle in real mental health care" feels like high-risk framing. Benchmarks rotate constantly, and “struggle” headlines almost always age badly as models jump whole generations.
I bet there are many "boring but right" takes we overlook today, and I wonder if there's a practical way to surface them before hindsight does.
"Boring but right" generally means that this prediction is already priced in to our current understanding of the world though. Anyone can reliably predict "the sun will rise tomorrow", but I'm not giving them high marks for that.
I'm giving them higher marks than the people who say it won't.
LLMs have seen huge improvements over the last 3 years. Are you going to make the bet that they will continue to make similarly huge improvements, taking them well past human ability, or do you think they'll plateau?
The former is the boring, linear prediction.
>The former is the boring, linear prediction.
right, because if there is one thing that history shows us again and again, it's that things that go through a period of huge improvements never plateau but instead continue improving to infinity.
Improvement to infinity, that is the sober and wise bet!
9 replies →
LaunchHN: Announcing Twoday, our new YC-backed startup coming out of stealth mode.
We’re launching a breakthrough platform that leverages frontier-scale artificial intelligence to model, predict, and dynamically orchestrate solar luminance cycles, unlocking the world’s first synthetic second sunrise by Q2 2026. By combining physics-informed multimodal models with real-time atmospheric optimisation, we’re redefining what’s possible in climate-scale AI and opening a new era of programmable daylight.
2 replies →
> Are you going to make the bet that they will continue to make similarly huge improvements
Sure yeah why not
> taking them well past human ability,
At what? They're already better than me at reciting historical facts. You'd need some actual prediction here for me to give you "prescience".
13 replies →
LLMs aren't getting better that fast. I think a linear prediction says they'd need quite a while to maybe get "well past human ability", and if you incorporate the increases in training difficulty the timescale stretches wide.
> The former is the boring, linear prediction.
Surely you meant the latter? The boring option follows previous experience. No technology has ever not reached a plateau, except for evolution itself I suppose, till we nuke the planet.
Perhaps a new category: "highest-risk guess, but right most often". Those are the high-impact predictions.
Prediction markets have pretty much obviated the need for these things. Rather than rely on "was that really a hot take?" you have a market system that rewards those with accurate hot takes. The massive fees and lock-up period discourage low-return bets.
3 replies →
Something like correctness^2 × novel information content as a ranking metric?
Actually, now that I think about it, incorrect information has negative value, so the metric should probably reflect that.
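A minimal sketch of that metric (the function name, parameter ranges, and exact weighting are my own assumptions, not anything specified in the thread): correct claims score as confidence squared times novelty, and incorrect ones go negative rather than just zero.

```python
def prediction_score(correct: bool, confidence: float, novelty: float) -> float:
    """Toy metric: correctness^2 weighted by novel information content.

    confidence: how strongly the prediction was asserted, in (0, 1].
    novelty: how far the claim departed from consensus, in [0, 1].
    Incorrect predictions score negative, per the comment above.
    """
    base = (confidence ** 2) * novelty
    return base if correct else -base

# A confident, contrarian, correct call outranks a safe, consensus one...
assert prediction_score(True, 0.9, 0.8) > prediction_score(True, 0.9, 0.1)
# ...and being confidently wrong is penalized, not merely zeroed out.
assert prediction_score(False, 0.9, 0.8) < 0
```

One design choice here: squaring confidence means hedged claims earn little either way, so the metric mostly moves on bold calls.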
The one about LLMs and mental health is not a prediction but a current news report, the way you phrased it.
Also, the boring consistent progress case for AI plays out in the end of humans as viable economic agents requiring a complete reordering of our economic and political systems in the near future. So the “boring but right” prediction today is completely terrifying.
“Boring” predictions usually state that things will continue to work the way they do right now. Which is trivially correct, except in cases where it catastrophically isn’t.
So the correctness of boring predictions is unsurprising, but also quite useless, because predicting the future is precisely about predicting those events which don’t follow that pattern.
[dead]
Instead of "LLM's will put developers out of jobs" the boring reality is going to be "LLM's are a useful tool with limited use".
That is at odds with predicting based on recent rates of progress.
This suggests that the best way to grade predictions is some sort of weighting by how unlikely they were at the time. Like, if you were to open a prediction market for statement X: a score based on the delta between your stated confidence in the event and the market's "expected" value, summed over all your predictions.
Exactly, that's the element that is missing. If there are 50 comments against and one pro and that pro has it in the longer term then that is worth noticing, not when there are 50 comments pro and you were one of the 'pros'.
Going against the grain and turning out right is far more valuable than being right consistently when the crowd is with you already.
Yeah, a simple ratio of total points on pro comments vs. total points on con comments may be simple and exact enough to simulate a prediction market. I don't know if it can be included in the prompt or whether it's better to vibecode it in directly.
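As a sketch of that idea (the function names, the +1 smoothing, and the log-scoring rule are all my own assumptions): treat upvote totals on pro vs. con comments as an implied crowd probability, then credit a commenter by how surprising their correct side was under that prior.

```python
import math

def crowd_probability(pro_points: int, con_points: int) -> float:
    """Implied 'market' probability of the pro side from comment scores,
    with +1 smoothing so a one-sided thread never yields exactly 0 or 1."""
    return (pro_points + 1) / (pro_points + con_points + 2)

def contrarian_credit(was_pro: bool, outcome_pro: bool,
                      pro_points: int, con_points: int) -> float:
    """Credit for a commenter: surprisal of their side under the crowd prior.

    Being right when the crowd gave your side low probability pays the
    most; agreeing with an already-confident crowd pays almost nothing.
    """
    p_crowd = crowd_probability(pro_points, con_points)
    p_side = p_crowd if was_pro else 1.0 - p_crowd
    if was_pro != outcome_pro:
        return 0.0  # wrong side earns nothing in this toy version
    return -math.log(p_side)

# The lone correct "pro" against 50 "con" comments beats one of 50 "pros".
lone = contrarian_credit(True, True, pro_points=1, con_points=50)
herd = contrarian_credit(True, True, pro_points=50, con_points=1)
assert lone > herd
```

This matches the comment above: being right against 50 opposing comments is worth noticing, while being right alongside 50 agreeing ones barely registers.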
I predict that, in 2035, 1+1=2. I also predict that, in 2045, 2+2=4. I also predict that, in 2055, 3+3=6.
By 2065, we should be in possession of a proof that 0+0=0. Hopefully by the following year we will also be able to confirm that 0*0=0.
(All arithmetic here is over the natural numbers.)
It's because algorithmic feeds based on "user engagement" reward antagonism. If your goal is to get eyes on content, being boring, predictable, and nuanced is a sure way to get lost in the ever-increasing noise.
> One thing this really highlights to me is how often the "boring" takes end up being the most accurate.
Would the commenter above mind sharing the method behind their generalization? Many people would spot-check maybe five items -- which is enough for our brains to start guessing at potential patterns -- and stop there.
On HN, when I see a generalization, one of my mental checklist items is to ask "what is this generalization based on?" and "If I were to look at the problem with fresh eyes, what would I conclude?".
Is this why depressed people often end up making the best predictions?
In personal situations there's clearly a self-fulfilling prophecy going on, but when it comes to the external world, the predictions come out pretty accurate.