Comment by crvdgc

2 hours ago

Among the six patterns identified, it's interesting that "Iterative AI Debugging" takes more time (and possibly tokens) but results in worse scores than letting AI do everything. So this part really should be handed over to agent loops.

The three high-scoring patterns are interesting as well. "Conceptual Inquiry" actually takes less time than the other two and doesn't score any higher than they do, which is quite surprising to me.