Comment by garfij
7 days ago
We've done varying forms of this with differing degrees of success at work.
Dynamic, on-the-fly generation & execution is definitely fascinating to watch in a sandbox, but is far too scary (from a compliance/security/sanity perspective) without spending a lot more time on guardrails.
We do, however, take note of hallucinated tool calls, and have had the model suggest an implementation for us to start from; we have several such tools in production now.
It's also useful to spin up any completed agents and interrogate them about what tools they might have found useful during execution (or run any number of other post-process questionnaires you can think of).
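As a rough illustration of the "take note of hallucinated tool calls" idea, here is a minimal sketch, not our actual setup. It assumes tool calls arrive as a name plus a dict of arguments; the `TOOLS` registry, `dispatch`, and `wishlist` are hypothetical names made up for this example.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry of real tools; the single entry is illustrative only.
TOOLS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
}

# Tally of tool names the model requested that are not in the registry.
hallucinated: Counter = Counter()


def dispatch(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Run a registered tool, or record the miss and return an error payload."""
    tool = TOOLS.get(name)
    if tool is None:
        # Nothing is generated or executed here; the request is only logged.
        hallucinated[name] += 1
        return {"ok": False, "error": f"unknown tool: {name}"}
    return {"ok": True, "result": tool(**args)}


def wishlist(min_count: int = 2) -> List[Tuple[str, int]]:
    """Hallucinated tool names seen at least min_count times, most frequent first."""
    return [(n, c) for n, c in hallucinated.most_common() if c >= min_count]
```

The output of `wishlist()` is then a candidate list for a human to review and implement properly, rather than something that gets built and run on the fly.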
>Dynamic, on-the-fly generation & execution is definitely fascinating to watch in a sandbox, but is far too scary (from a compliance/security/sanity perspective) without spending a lot more time on guardrails.
Would love love love to hear more on what you are doing here? This seems super fascinating (and scary). :)