Comment by cuchoi

11 hours ago

In my case, it is realistic as my agents don't have permissions to reply to emails. But you correctly point out this doesn't cover all cases.

Having the agent reply would have been more fun and a better exercise, but too expensive.

11 comments

microgpt 4 hours ago

You've proven that an agent that doesn't read emails and doesn't reply to emails can't exfiltrate data by email. Is that a useful test?

cuchoi 4 hours ago
The agent did read the emails
- microgpt 1 hour ago
  
  [dead]

johndhi 10 hours ago

What makes it expensive to reply to an email?

Customer service software regularly uses AI responses for email. Is the issue that your agent is using the claw for more than needed (like it's clicking send rather than just accessing an API)?

antonvs 10 hours ago
This experiment used Opus 4.6. Customer service bots typically are not using frontier models.
- johndhi 5 hours ago
  
  Gemini says: "It would cost approximately $6.25 to $30.00 to have Claude Opus 4.6 respond to 10,000 emails, assuming a typical 200-word input and 50-word output per email."
  

xgulfie 8 hours ago

I feel like your agent being unable to respond to the emails and not spelling that out renders your whole thing almost completely moot

This is like saying "try to hack my computer and steal my crypto wallet" but your computer can't send any packets

cuchoi 4 hours ago

The agent had permissions to reply to emails, it was just instructed not to.

Tepix 8 hours ago

Well, how difficult is it to switch to something (much) cheaper like DeepSeek v4 flash?