Comment by flash_us0101

8 days ago

Most replies here are about writing code faster. But there's a gap nobody's talking about: AI agents are completely blind to running systems.

When you hit a runtime bug, the agent's only tool is "let me add a print statement and restart". That works for simple cases but it's the exact same log-and-restart loop we fall back to in cloud and containerized environments, just with faster typing.

Where it breaks down: timing-sensitive code, Docker services, anything where restarting changes the conditions you need to reproduce.

I've had debugging sessions where the agent burned through 10+ restart cycles on a bug that would've been obvious if it could just watch the live values.

We've given agents the ability to read and write code. We haven't given them the ability to observe running code. That's a pretty big gap.

I've used agents to look at traces and stack dumps, and to drive tools like debuggers. I've had them exec into running containers and poke around. I've had them examine metrics, dig through existing logs, read pcaps, and more. Any command I could type into a console, they can run, and they can reason about its output.
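As a minimal sketch of that kind of console-level observation (assuming Linux and its `/proc` filesystem; here the shell inspects itself as a stand-in for a real service PID):

```shell
# Observe a live process through /proc instead of restarting it.
# $$ (this shell) stands in for the PID of the service under investigation.
pid=$$
grep '^State' /proc/$pid/status      # current run state (R/S/D...)
ls /proc/$pid/fd | wc -l             # how many file descriptors it holds right now
tr '\0' '\n' < /proc/$pid/cmdline    # the exact command line it was started with
```

Each of those outputs is plain text an agent can read and reason about, no restart required.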

In fact, last night I had one hacking away at a WordPress template. It was making changes and then checking screenshots from a browser window to confirm its changes worked as planned.

  • That's close to what I'm thinking about. Curious what debugger setup you're using with agents - are you giving them access via MCP or just having them run CLI commands?

Easy: give the logs timestamps, and the LLM can sort out the order.

  • Timestamps aren't the issue. The problem is the cycle itself: stop the process, add the log line, restart, wait for the right conditions to hit that code path again. For anything timing-sensitive or dependent on external state, each restart changes what you're trying to observe.
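One way to break that cycle, at least for a Python service, is to wire in a live-inspection hook up front so there's never a reason to restart just to see state. A minimal sketch using the stdlib `faulthandler` (the choice of SIGUSR1 and the log filename are just assumptions):

```python
import faulthandler
import os
import signal

# Register a handler so SIGUSR1 dumps every thread's stack to this log,
# without pausing or restarting the process.
log = open("stacks.log", "w")
faulthandler.register(signal.SIGUSR1, file=log)

# Simulate an outside observer: an agent would run `kill -USR1 <pid>`
# from a console while the timing-sensitive code path is live.
os.kill(os.getpid(), signal.SIGUSR1)
```

From there the agent can fire `kill -USR1 $(pidof myservice)` and read `stacks.log` whenever it wants a snapshot, and the process keeps running under the exact conditions that triggered the bug.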