Comment by BurningFrog

5 hours ago

This makes sense, but a logical next step is to have one AI write code, and then have another AI, instead of humans, verify it.

Or are current AIs too similar for that to be fruitful?

1 comment

BurningFrog

suttontom 3 hours ago

This is commonly known as "LLM-as-a-judge" and anecdotally multiple people I know who write code using OpenRouter or using multiple models say it's surprisingly effective. It's strange that there don't appear to be any major papers on it since ~early 2025, which at this point is basically ancient history.