Comment by BurningFrog
5 hours ago
This makes sense, but a logical next step is to have one AI write code, and then have another AI, instead of humans, verify it.
Or are current AIs too similar for that to be fruitful?
This is commonly known as "LLM-as-a-judge". Anecdotally, several people I know who write code through OpenRouter, or who otherwise mix multiple models, say it's surprisingly effective. It's strange that there don't appear to have been any major papers on it since ~early 2025, which at this point is basically ancient history.
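
For anyone who hasn't seen the pattern, here is a rough sketch of the write-then-judge loop. It is only an illustration: `generate_with_judge`, the `Model` type, and the "APPROVE" convention are made up for this example, and each model is represented as a plain function from prompt text to reply text rather than a call to any real API.

```python
from typing import Callable, Optional

# Hypothetical stand-in: a "model" is any function mapping a prompt to a reply.
Model = Callable[[str], str]


def generate_with_judge(
    task: str, writer: Model, judge: Model, max_rounds: int = 3
) -> Optional[str]:
    """Ask `writer` for code, let `judge` review it, and retry with feedback."""
    feedback = ""
    for _ in range(max_rounds):
        # On retries, pass the judge's objection back to the writer.
        prompt = task if not feedback else f"{task}\n\nReviewer feedback:\n{feedback}"
        code = writer(prompt)
        verdict = judge(
            f"Task:\n{task}\n\nCode:\n{code}\n\n"
            "Reply APPROVE or explain the problem."
        )
        # Assumed convention: a verdict starting with APPROVE means accepted.
        if verdict.strip().upper().startswith("APPROVE"):
            return code
        feedback = verdict
    return None  # nothing was approved within max_rounds
```

In practice `writer` and `judge` would wrap two different models, which is where the "too similar?" question matters: a judge that shares the writer's blind spots will approve the same mistakes.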