Comment by jaen

5 days ago

People building these Rube Goldberg contraptions: do you actually run evaluations to check whether this is any better at all than, e.g., giving it access to a Python REPL, or are you just toughing it out with random tools composed via shell scripts?

Why would an LLM be better trained to access Redis via a FS vs. a native library API?

Makes no sense.

> Why would an LLM be better trained to access Redis via a FS vs. a native library API?

Limiting the potential blast radius.

If you give an agent "access to a Python REPL" (your words), you're giving it access to all of Python. In other words, you're paving the road to your own destruction for when the agent goes awry. With a full Python interpreter, you're basically handing it an eight-lane highway on which all sorts of pile-ups and havoc can happen.

By limiting its access to specific operations via well-defined endpoints (which is what the AGFS approach is), you're trimming that eight-lane highway back to a bicycle path.
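A rough sketch of that contrast (all names here, like `endpoint_tool` and `ALLOWED_OPS`, are hypothetical and not from any real AGFS implementation): a REPL tool hands the agent arbitrary evaluation, while an endpoint-per-operation layout only dispatches to a small, vetted allowlist.

```python
# Hypothetical sketch: raw REPL tool vs. allowlisted endpoint dispatcher.
# The names and layout are invented for illustration.

# REPL-style tool: the agent can run arbitrary Python.
def repl_tool(code: str):
    return eval(code)  # full blast radius: os, subprocess, sockets, ...

# Endpoint-style tool: each "path" maps to exactly one vetted operation
# (here backed by a plain dict standing in for Redis).
ALLOWED_OPS = {
    "/redis/get": lambda store, key: store.get(key),
    "/redis/set": lambda store, key, value: store.__setitem__(key, value),
}

def endpoint_tool(store: dict, path: str, *args):
    if path not in ALLOWED_OPS:
        raise PermissionError(f"no such endpoint: {path}")
    return ALLOWED_OPS[path](store, *args)

store = {}
endpoint_tool(store, "/redis/set", "greeting", "hello")
print(endpoint_tool(store, "/redis/get", "greeting"))  # hello
```

The point of the sketch is only the shape of the interface: with the dispatcher, anything outside the allowlist fails closed, whereas the REPL fails open.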

  • That didn't answer the original question... Security is orthogonal to performance.

    My question was: how is the performance better? (As implied by the word "evaluations".)

    (Also, the original post was about exposing all sorts of random ops via a file system, accessed via general shell tools most of the time, so pretty likely there's basically zero added security.)