Comment by throwawayffffas

5 hours ago

> We also built a way to simulate what an agent would have seen at any point in the past. Each model gets access to market data, news APIs, company financials—but all time filtered: agents see only what would have been available on that specific day during the test period.

That's not going to work. These agents, especially the larger ones, will have news about the companies embedded in their weights.

Funny how, if you'd kept reading before commenting, you'd have seen that they addressed that point specifically:

> We were cautious to only run after each model’s training cutoff dates for the LLM models. That way we could be sure models couldn’t have memorized market outcomes.
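
For anyone wondering what the two safeguards quoted above amount to in practice, here is a rough sketch. It is not the authors' code: `Record`, `visible_on`, and `is_after_cutoff` are hypothetical names, and the dates are made up. It only shows the idea of (a) filtering data to what was published on or before the simulated day, and (b) refusing to start a test window that is not strictly after the model's training cutoff.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """Hypothetical data item (news, filing, price bar) with a publish date."""
    published: date
    text: str


def visible_on(records, as_of):
    """Return only the records published on or before the simulated day."""
    return [r for r in records if r.published <= as_of]


def is_after_cutoff(test_start, training_cutoff):
    """True if the test window starts strictly after the training cutoff."""
    return test_start > training_cutoff


# Made-up example data and dates, for illustration only.
records = [
    Record(date(2025, 1, 10), "Q4 earnings released"),
    Record(date(2025, 3, 1), "Merger announced"),
]
simulated_day = date(2025, 2, 1)
assumed_cutoff = date(2024, 12, 31)

if is_after_cutoff(simulated_day, assumed_cutoff):
    for r in visible_on(records, simulated_day):
        print(r.published, r.text)
```

The date filter alone covers only what the tools return; the cutoff check is what deals with anything already baked into the weights, which is the concern raised in the parent comment.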