Comment by monkeydust

2 months ago

I have been thinking about this a bit: rather than rely on one model, have an agentic setup that takes a question, runs it against the top 3 models, and then uses another model to judge the responses and decide which one to give back.

Is anyone doing this for high-stakes questions / research?

The argument against is that the models are fairly 'similar', as outlined in one of the award-winning papers from NeurIPS '25 - https://neurips.cc/virtual/2025/loc/san-diego/poster/121421
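
For anyone who wants to try the panel-plus-judge idea, here is a rough sketch of the shape it could take. Everything in it is an assumption for illustration: `Model` is just "a function from prompt to text", the names `ask_panel`, `build_judge_prompt` and `judged_answer` are made up, and the lambdas are stand-ins where real model API calls would go. The answers are shown to the judge as A/B/C so it cannot favor a model by name.

```python
from typing import Callable, Dict

# Hypothetical type: any function that takes a prompt and returns text.
# In a real setup each one would wrap a model API call; none is assumed here.
Model = Callable[[str], str]

LABELS = "ABCDEFGH"  # supports a panel of up to 8 models


def ask_panel(question: str, panel: Dict[str, Model]) -> Dict[str, str]:
    """Send the same question to every model on the panel."""
    return {name: model(question) for name, model in panel.items()}


def build_judge_prompt(question: str, answers: Dict[str, str]) -> str:
    """List the answers under anonymous labels and ask for the best one."""
    lines = [f"Question: {question}", ""]
    for label, answer in zip(LABELS, answers.values()):
        lines.append(f"Answer {label}: {answer}")
    lines.append("")
    lines.append("Reply with only the letter of the best answer.")
    return "\n".join(lines)


def judged_answer(question: str, panel: Dict[str, Model], judge: Model) -> str:
    """Fan the question out to the panel, then return the judge's pick."""
    if not panel:
        raise ValueError("panel must contain at least one model")
    answers = ask_panel(question, panel)
    by_label = dict(zip(LABELS, answers.values()))
    verdict = judge(build_judge_prompt(question, answers)).strip().upper()
    # Fall back to the first answer if the judge's reply is not a valid label.
    return by_label.get(verdict[:1], by_label["A"])


# Stand-in models so the sketch runs without any API access.
panel = {
    "model_1": lambda q: "4",
    "model_2": lambda q: "5",
    "model_3": lambda q: "4",
}
judge = lambda prompt: "A"

print(judged_answer("What is 2 + 2?", panel, judge))
```

Note this does nothing about the similarity problem above: if the panel and the judge share the same blind spots, the judge will happily pick a wrong answer that all three agree on.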


Workaccount2 2 months ago

I often put the models in direct conversation with each other to work out a framework or solution. It works pretty well, but they do tend to glaze each other a bit.