Comment by jchung

2 days ago

@Felix - How are you thinking about observability? Anthropic is very clear that evals are critical for agentic processes (your engineering blog covered this just last week). For my company to roll out agent access to all staff, I'd need some way for staff (or IT) to know (a) how reliable the systems are (i.e., evals), (b) how safe the systems are (e.g., via audit trails), and (c) how often the access granted to agents is the right amount.

This has been one of the biggest bottlenecks for our company: not the capability of the agents themselves, but the tools needed to roll them out responsibly.