Comment by angry_octet

15 hours ago

I don't trust the harness, and I especially don't trust that the LLM won't be able to subvert the harness, or trick me via the harness. I assume that the LLM will be able to leak any secret in the harness context to arbitrary internet destinations, or somehow encode the secret in a work product. Eg space characters at the end of lines encoding access tokens.

Having the harness in one VM, and tool use applied to user data in another, is about as safe as you can be at present. You can mount filesystem fragments from the data VM into the harness VM, but tool execution remains painful.

Having all authorisation and access control exist outside of the harness layer is essential. It should only have narrowly scoped and time limited credentials that are bound to its IP, and even then that is problematic.

2 comments

angry_octet

wswope 1 hour ago

My approach to this has been a NixOS host with the harness running in a rootless podman sidecar.

The host has squid configured with a self-signed CA and networking rules to route all host traffic to the intercepting proxy, so I have a tight firewall and full auditability.

Then there’s a python rpc daemon running on the host with a set of whitelisted commands, read-only for pulling logs and diagnostics.

By default, the agent runs in a split pane tmux session with a host shell on the left and the chat interface on the right. The rpc whitelist includes the proper `tmux capture-pane` invocation to pull from the host shell, so I can easily let it see what I’m doing if I want it to help debug something.

I’m using pi as my harness and have custom extensions that give Yes/No confirmation gates for any writes the agent makes and that pass all bash commands/file writes to a deepseek subagent for review.

Still early days, but as someone with a similarly paranoid mindset around running LLMs securely, I think the future is promising and we’ll see some new “best practices” and related tooling popping up shortly.

angry_octet 1 hour ago

NixOS is a great place to start from.
Trusted observability will be key. Why am I giving the harness the ability to read/modify files when the harness lives in the same action space as tools? No, the gates should be controlled elsewhere, and even when I have given carte blanche, I want to see what has been done, step by step. So a controlled CA that allows for inspection of requests is great for logging.