Comment by xnorswap

1 month ago

> it really doesn't need anything else than cannot be done in a terminal

I strongly disagree with this.

Claude Code would be super-powered if it had a better grasp of running processes without relying on logged output. Imagine if it could trace running programs directly, spotting exceptions and gauging performance in real time.
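As a rough, in-process illustration of that idea (not something Claude Code is known to do today), Python's `sys.settrace` hook can see exceptions as they are raised, including ones the program swallows and never logs, while a timer gauges the call. The names `trace_call` and `parse_port` are hypothetical, made up for this sketch.

```python
import sys
import time


def trace_call(func, *args):
    """Run func under a trace hook; return (result, exceptions seen, seconds elapsed)."""
    seen = []

    def tracer(frame, event, arg):
        # "exception" events fire in the frame where an exception is raised or passes through.
        if event == "exception":
            exc_type, exc_value, _tb = arg
            seen.append((frame.f_code.co_name, exc_type.__name__, str(exc_value)))
        return tracer  # keep tracing inside this frame

    previous = sys.gettrace()  # preserve any debugger or coverage hook already installed
    start = time.perf_counter()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, seen, time.perf_counter() - start


def parse_port(text):
    # Hypothetical example: the ValueError is swallowed and never logged.
    try:
        return int(text)
    except ValueError:
        return 8080


result, seen, elapsed = trace_call(parse_port, "http")
```

Here `result` is the fallback port, yet `seen` still records that a `ValueError` was raised inside `parse_port`. Tracing a separate, already-running process would need a debugger or OS-level tooling, which this sketch does not attempt.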

It would also be super-powered if it could navigate a codebase and refactor through language servers, rather than editing files through search & replace.

Imagine if, instead of working on source text, the program were first parsed into an Abstract Syntax Tree and Claude worked directly on that AST.

Never a misplaced semicolon* or a forgotten import directive.
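A minimal sketch of the difference, using Python's standard `ast` module: a rename done on the tree touches only real identifiers, where a textual search & replace would also hit the string literal. The class `RenameFunction`, the helper `rename_function`, and the sample source are all hypothetical. The sketch is deliberately naive: it ignores scoping, and `ast.unparse` drops comments and original formatting.

```python
import ast


class RenameFunction(ast.NodeTransformer):
    """Rename a function definition and every bare-name reference to it."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)  # also rewrite references inside the body
        return node

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node


def rename_function(source, old, new):
    tree = RenameFunction(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


SOURCE = '''
def area(w, h):
    return w * h

label = "area"
print(area(2, 3))
'''

renamed = rename_function(SOURCE, "area", "rect_area")
```

Because the output is generated from a tree, it is syntactically valid by construction. A real tool would need scope analysis (which is what language servers provide) before it could rename safely.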

Operating on it would need a fundamentally different model from an LLM, but I'm convinced that treating text as the endgame is a form of blub.

It's where we are now, and it's working very well, but it shouldn't be considered the long-term goal. We can do better.

* To be fair, this one hasn't been an issue for a while now.

1 comment

furyofantares 1 month ago

For small games I work on, I make sure Claude (well, Codex CLI) can produce screenshots of whatever screen it's working on and evaluate them. It has instructions to use codex exec (the equivalent of claude -p) to spin up a clean instance for evaluation, so it can pass a screenshot plus a description of what's expected and get back a pass/fail and a description of any failure. The main agent can also just view the image itself, but for things with a clear pass/fail I prefer that it invoke a clean context.
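A hedged sketch of what that clean-context check might look like as a helper script. The prompt wording, the PASS/FAIL first-line convention, and the function names are assumptions for illustration, not the setup described above. It also assumes `claude -p <prompt>` prints its reply to stdout and that the fresh instance can read the image at the given path.

```python
import subprocess


def build_prompt(image_path, expectation):
    # Assumed prompt format; the clean instance only sees this text.
    return (
        f"Look at the screenshot at {image_path}. Expected: {expectation}\n"
        "Reply with PASS or FAIL on the first line, then describe any failure."
    )


def parse_verdict(reply):
    """Return (passed, detail) from a reply whose first line is PASS or FAIL."""
    lines = reply.strip().splitlines()
    first = lines[0].strip().upper() if lines else ""
    detail = "\n".join(lines[1:]).strip()
    return first.startswith("PASS"), detail  # anything unclear counts as a failure


def evaluate_screenshot(image_path, expectation, run=None):
    if run is None:
        def run(prompt):
            # Assumption: a one-shot, non-interactive call in a fresh context.
            done = subprocess.run(
                ["claude", "-p", prompt],
                capture_output=True, text=True, check=True,
            )
            return done.stdout
    return parse_verdict(run(build_prompt(image_path, expectation)))
```

The `run` parameter is injectable so the parsing can be exercised without the CLI installed, for example `evaluate_screenshot("menu.png", "title is centered", run=lambda p: "FAIL\nTitle is cut off.")`.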