Comment by gravypod
5 hours ago
This is very interesting. I wonder if someone could create a future-sight benchmark for these models. For example: given a set of newspaper articles from the past N months, can a model predict whether certain world events will happen? We could backtest against events that have resolved since the training cutoff.
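To make the backtest idea concrete, here is a rough sketch of how the scoring could work. It assumes each question carries a resolution date, a yes/no outcome, and the model's predicted probability, and it scores with the Brier score (mean squared error between forecast and outcome, lower is better). The names `Question`, `brier_score`, and `training_cutoff` are made up for illustration, and the Brier score is just one reasonable choice of scoring rule, not something any particular benchmark is confirmed to use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Question:
    text: str
    resolved_on: date  # when the real-world outcome became known
    outcome: bool      # what actually happened
    forecast: float    # model's predicted probability of "yes", in [0, 1]


def brier_score(questions, training_cutoff):
    """Mean squared error of forecasts, over questions resolved after the cutoff."""
    # Only score questions the model could not have seen resolved in training.
    eligible = [q for q in questions if q.resolved_on > training_cutoff]
    if not eligible:
        raise ValueError("no questions resolved after the training cutoff")
    return sum((q.forecast - float(q.outcome)) ** 2 for q in eligible) / len(eligible)
```

A forecaster that always answers 0.5 scores 0.25 under this rule, so that is the baseline a model would need to beat.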
FYI, ForecastBench [1] tests LLMs' out-of-sample forecasting accuracy.
[1] https://www.forecastbench.org/