Comment by Reubend

5 days ago

It's a very interesting concept, and I signed up to try it. However, after seeing the landing page, my first question was:

"Where's the data on accuracy?"

Backtesting is difficult to do correctly with LLMs, but because this is marketed as being for macro investing, I would expect to see a level of rigor and quantitative analysis consistent with that.

The Monte Carlo simulation engine sounds really cool, but is there evidence to indicate that it generates superior results to expert predictions, or to LLMs alone?

I actually think it would be totally fine for your beta version to have low accuracy numbers. After all, this seems to be something in the very early stages. But having no quantitative analysis of your system's performance definitely makes me uneasy about trusting it.

2 comments


muggermuch 5 days ago

> because this is marketed as being for macro investing, I would expect to see a level of rigor and quantitative analysis consistent with that.

Thanks for bringing this up. While we talk about Soros' forecasts and about comparing them against those of an LLM, in the end Soros is not a forecasting tool; it's an analytical framework.

There is a gap between quant modeling and geopolitical analysis that we seek to fill. Specifically, quant models are great at capturing statistical regularities in financial time series but typically treat geopolitical shocks as exogenous noise. Meanwhile, geopolitical analyses in the policy and intelligence communities (with the exception of Bueno de Mesquita [BdM]'s work) provide deep contextual reasoning but rarely produce probabilistic scenario structures or asset-level transmission mappings that can directly inform capital allocation.

We will shortly be publishing a technical preprint laying out the Soros framework in full, but the TL;DR is: we model geopolitical events (called "crises" in the literature) as partially observed ("fog of war") stochastic games in which multiple actors jostle for control over resources. We map out actors across various axes (think of these as actor embeddings), identify key decision points, and enumerate paths across them to estimate scenario probabilities. The scenarios in turn have associated transmission flows and market implications. We will evaluate those as mentioned in the sibling comment. Happy to discuss more.
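To make "enumerate paths across decision points" concrete, here is a toy sketch in Python. It is not the Soros implementation: the decision points, their probabilities, the `classify` rule, and the assumption that decision points are independent are all hypothetical placeholders chosen for illustration. A real stochastic game with partial observability would condition each branch on earlier moves.

```python
from collections import defaultdict
from itertools import product

# Hypothetical decision points; each maps an option to an assumed probability.
DECISION_POINTS = {
    "actor_a_escalates": {"yes": 0.3, "no": 0.7},
    "actor_b_sanctions": {"yes": 0.6, "no": 0.4},
}


def classify(path):
    """Toy rule mapping one path (a dict of choices) to a scenario label."""
    yes_count = sum(1 for choice in path.values() if choice == "yes")
    if yes_count == 2:
        return "open_conflict"
    if yes_count == 1:
        return "tension"
    return "status_quo"


def scenario_probabilities(decision_points, classify_fn):
    """Enumerate every path and sum path probabilities per scenario.

    Assumes decision points are independent, so a path's probability is
    the product of its branch probabilities.
    """
    names = list(decision_points)
    totals = defaultdict(float)
    branches = (decision_points[name].items() for name in names)
    for choices in product(*branches):
        path = {name: option for name, (option, _) in zip(names, choices)}
        prob = 1.0
        for _, branch_prob in choices:
            prob *= branch_prob
        totals[classify_fn(path)] += prob
    return dict(totals)


if __name__ == "__main__":
    for scenario, prob in scenario_probabilities(DECISION_POINTS, classify).items():
        print(f"{scenario}: {prob:.2f}")
```

Exhaustive enumeration grows exponentially with the number of decision points, which is one reason a sampling approach such as Monte Carlo becomes attractive for larger trees.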

muggermuch 5 days ago

First, thank you so much for signing up to try out Soros!

You are absolutely right, of course, to ask about accuracy. TL;DR: we don't have any formal calibration data yet.

The reason why is interesting, though, and it strikes at the heart of global macro investing in particular: things change often, and sometimes dramatically. Geopolitical "events" are really smeared across time (and sometimes space). Each update to an event can set off a cascade of new scenarios branching off and older ones dying out, each with implications for capital flows. This is difficult to disentangle, which is why our preference has been to have the system itself monitor feeds, update its alerts as it sees fit, and re-run the analysis when it judges there has been enough of a change of state (pun not intended).

One markets-focused eval we have been building towards (and apparently you have been thinking of as well) is a comparison against LLMs. Our plan is to run simultaneous comparisons against a variety of frontier models, armed with the same information that we provide Soros but without the structural framework and simulation engine we've built. Ideally, we want to map out the Pareto frontier of model capability vs. realized returns, examine performance across horizons, asset classes, and so on, and have concrete numbers on where Soros pushes the curve outwards.
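For readers unfamiliar with the term, a Pareto frontier over (capability, realized return) pairs can be computed roughly as below. This is a sketch under assumptions, not our eval code: it treats capability as a cost-like axis (less is preferred at equal return) and return as the benefit axis, and the system names and numbers in the usage lines are made-up placeholders, not results.

```python
def pareto_frontier(points):
    """Return the non-dominated points, sorted by capability.

    points: list of (name, capability, realized_return) tuples.
    Assumption: a point is dominated if another point has capability that is
    no higher and return that is no lower, with at least one strictly better.
    """
    frontier = []
    for name, cap, ret in points:
        dominated = any(
            (other_cap <= cap and other_ret >= ret)
            and (other_cap < cap or other_ret > ret)
            for _, other_cap, other_ret in points
        )
        if not dominated:
            frontier.append((name, cap, ret))
    return sorted(frontier, key=lambda point: point[1])


if __name__ == "__main__":
    # Placeholder values for illustration only; these are not measurements.
    example = [
        ("system_a", 1.0, 0.02),
        ("system_b", 2.0, 0.01),
        ("system_c", 2.0, 0.05),
        ("system_d", 3.0, 0.04),
    ]
    for name, cap, ret in pareto_frontier(example):
        print(name, cap, ret)
```

"Pushing the curve outwards" would then mean that a system lands above the frontier formed by the baseline models alone, at the same capability level.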

This is being built :), and we hope to get there in the coming few weeks!