Ask HN: Do you know what data your AI coding agent sends to the cloud?

9 hours ago

Every session, my AI coding agent reads files, runs commands, and makes API calls. I have no idea exactly what ends up in the cloud. Is anyone actually tracking this at a granular level, or do we just trust the tool?

I trust the tool only in the sense that I don't put anything sensitive into it! Unless I built it myself, I assume the data is going somewhere.

We have a policy at work around this: our most sensitive data can only be passed to on-prem models.

That being said, I have no evidence of anything going to the cloud, or of frontier providers doing anything with chat history other than storing it for later.

Self-hosted + custom harness for anything I don't want getting out at all.

  • Makes sense. Does your custom harness give you a record of what actually crossed the boundary, or is it mostly trust-based blocking?

    • My harness is only used with on-prem models, so I don't have any checks in place. If the GGUF is somehow calling home, I'm not catching it.
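
For anyone who wants a record of what crossed the boundary rather than trust-based blocking, here is a minimal sketch of an audit wrapper in Python. Everything in it is hypothetical: `EgressLog`, `send_with_audit`, and the `transport` callable are illustrative names, not part of any real agent or harness. It assumes you control the one function through which outbound requests pass, and it logs only size and hash so the log itself doesn't become a second copy of the sensitive data.

```python
import hashlib
import json
import time


class EgressLog:
    """In-memory record of every payload handed to a remote endpoint (hypothetical)."""

    def __init__(self):
        self.entries = []

    def record(self, destination: str, payload: bytes) -> dict:
        # Store metadata and a digest, not the payload itself.
        entry = {
            "ts": time.time(),
            "destination": destination,
            "bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        }
        self.entries.append(entry)
        return entry

    def dump(self) -> str:
        # One JSON object per line, easy to grep or diff later.
        return "\n".join(json.dumps(e, sort_keys=True) for e in self.entries)


def send_with_audit(log: EgressLog, destination: str, payload: bytes, transport):
    """Record the payload first, then pass it to the real sender.

    `transport` is an assumed callable (destination, payload) -> response;
    swap in whatever your harness actually uses to make the request.
    """
    log.record(destination, payload)
    return transport(destination, payload)
```

This only sees traffic routed through `send_with_audit`. It would not catch a model runtime or tool that opens its own connection, which is the "calling home" case above; that needs network-level monitoring instead.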

You don't. Even if you read the policy, it would be buried in legalese. Instead, give the agent access only to the kind of data you are okay with being sent to the cloud. Also, the company's reputation being at stake matters more than its policies.
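
One way to limit what the agent can read is a path allowlist checked before any file is opened on its behalf. This is a rough sketch, assuming your setup lets you intercept file reads; `is_allowed` and the example directories are hypothetical, not a feature of any particular tool.

```python
from pathlib import Path


def is_allowed(path, allowed_roots) -> bool:
    """Return True only if `path` resolves to a location inside an allowed root.

    Resolving first normalizes `..` segments and follows symlinks, so a path
    such as `src/../.env` cannot slip past a check on the `src` directory.
    """
    resolved = Path(path).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

For example, with `allowed_roots = ["/tmp/proj/src"]`, a request for `/tmp/proj/src/app.py` passes while `/tmp/proj/.env` and `/tmp/proj/src/../.env` are refused. A stronger version of the same idea is running the agent in a container or sandbox that only mounts the allowed directories.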