Comment by vinodhkps

7 months ago

What does the feedback loop look like for your agents? I wonder how hard it will be to generalize metrics across them.

akyshnik 7 months ago

Feedback is generated from evals. Example eval: function foo wasn't triggered even though [...]

Feedback (exaggerated): 1. change the stage prompt, 2. change the function description, 3. add extra instructions to the end of the context.
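
To make the eval-to-feedback step concrete, here is a rough sketch of how a failed eval could be turned into the three candidate changes listed above. Everything here is hypothetical and only for illustration: the `EvalFailure` record, its field names, the `"function_not_triggered"` label, the `"intake"` stage name, and `propose_feedback` are not from the actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalFailure:
    """Hypothetical record of one failed eval (all field names are assumptions)."""
    kind: str      # e.g. "function_not_triggered"
    function: str  # function the eval expected to be called, e.g. "foo"
    stage: str     # conversation stage in which the failure happened


def propose_feedback(failure: EvalFailure) -> list[str]:
    """Return candidate changes for a failed eval, mirroring the list above."""
    if failure.kind != "function_not_triggered":
        # This sketch only covers the one failure type from the example.
        return []
    return [
        f"change the prompt for stage '{failure.stage}'",
        f"change the description of function '{failure.function}'",
        "add extra instructions to the end of the context",
    ]


failure = EvalFailure(kind="function_not_triggered", function="foo", stage="intake")
for candidate in propose_feedback(failure):
    print(candidate)
```

In practice the candidates would probably be tried one at a time and re-evaluated, but that part is not described here.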

Metrics are easy to generalize (e.g. call transfer rate), but the baseline is different for each agent, so in the context of self-improvement we interpret only the changes, not the absolute values.
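
A small sketch of what "interpreting only the changes" could look like. The agent names and rates are made-up numbers, and using relative change (rather than, say, a difference in percentage points) is an assumption, not something stated above.

```python
def relative_change(baseline: float, current: float) -> float:
    """Change in a metric as a fraction of that agent's own baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


# Hypothetical call transfer rates: very different baselines per agent.
baselines = {"agent_a": 0.40, "agent_b": 0.10}
current = {"agent_a": 0.36, "agent_b": 0.09}

changes = {
    name: relative_change(baselines[name], current[name])
    for name in baselines
}

for name, change in changes.items():
    print(f"{name}: {change:+.1%} vs. its own baseline")
```

The absolute rates (0.36 vs. 0.09) are not comparable across the two agents, but the change relative to each agent's own baseline is, which is why the same metric can be reused everywhere.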