Comment by chr15m
19 days ago
If you view LLM-driven dev as a kind of evolutionary process rather than an engineering process (at the level of a single LLM output) then this makes a lot of sense. You're widening the population from which you select for fitness.
This was exactly the kernel of the idea :)
Ah interesting. Thank you very much for sharing the illuminating results.
One question I had - was the judgement blinded? Did judges know which models produced which output?
It was not. The agent ID is not overt, but it can be found via the workspace file path.
But that is a good point. Perhaps it should be mapped to something unidentifiable.
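A rough sketch of what that mapping could look like, assuming the judge only ever sees text that has passed through a redaction step. The agent IDs, the `candidate-N` label scheme, and the `/workspaces/...` path are all made up for illustration; nothing here reflects how the actual harness is structured.

```python
import random
import re


def blind_labels(agent_ids, seed=None):
    """Assign each agent ID an opaque label in a shuffled order.

    Shuffling keeps the label number from leaking the original ordering.
    """
    rng = random.Random(seed)
    shuffled = sorted(set(agent_ids))
    rng.shuffle(shuffled)
    return {agent_id: f"candidate-{i}" for i, agent_id in enumerate(shuffled, start=1)}


def redact(text, mapping):
    """Replace every agent ID in text (e.g. a workspace path) with its label."""
    if not mapping:
        return text
    # Longest IDs first so an ID that is a prefix of another does not win.
    ids = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(agent_id) for agent_id in ids))
    # Single pass, so an inserted label is never itself rewritten.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


# Hypothetical agent IDs and path, for illustration only.
mapping = blind_labels(["agent-x", "agent-y", "agent-z"], seed=0)
print(redact("/workspaces/agent-x/out.txt", mapping))
```

The mapping would be kept away from the judges and only used to un-blind the scores afterwards. This hides the ID itself; stylistic tells in the output could still give a model away.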