Comment by yomismoaqui

1 month ago

In an LLM world, text will also be king.

Sure, LLMs can understand images and video, but when you make your program spit debug text you make it easier and faster for Claude Code to iterate on it and fix any problems.
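To make "spit debug text" concrete, here is a minimal sketch of one way to do it: one JSON line per event on stderr, so an agent can read or grep the output without guessing at the format. The names `debug_event` and `move_player` are made up for illustration and are not from any particular project.

```python
import json
import sys


def debug_event(event, stream=sys.stderr, **fields):
    """Write one self-describing JSON line per event and return that line."""
    line = json.dumps({"event": event, **fields}, sort_keys=True)
    print(line, file=stream)
    return line


def move_player(x, dx, width):
    """Hypothetical game step: move the player and clamp to the playfield."""
    new_x = max(0, min(width - 1, x + dx))
    # Log inputs, result, and whether clamping kicked in, all as plain text.
    debug_event("move_player", x=x, dx=dx, new_x=new_x, clamped=new_x != x + dx)
    return new_x


if __name__ == "__main__":
    move_player(8, 5, 10)
```

Because every line is self-contained, the agent can rerun the program, diff the output, and see which step went wrong without a screenshot.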

See how much value a text UI program like Claude Code provides; it really doesn't need anything that cannot be done in a terminal.

2 comments

xnorswap 1 month ago

> it really doesn't need anything that cannot be done in a terminal

I strongly disagree with this.

Claude Code would be super-powered if it had a better grasp of running processes without relying on logged output. Imagine if it could somehow directly trace running programs, spotting exceptions and gauging performance in real time.

It would be super-powered if it could actually navigate around a codebase and refactor through language servers without having to edit files through search & replace.

Imagine if, instead of code, the program was first compiled to an Abstract Syntax Tree and Claude worked directly on that AST instead of on the source text.

Never a misplaced semicolon* or forgotten import directive.
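As a rough illustration of the AST idea, here is a sketch using Python's standard `ast` module (it needs Python 3.9+ for `ast.unparse`). It renames a function by editing tree nodes rather than text, so the output is always syntactically valid. This is only a toy: the rename is not scope-aware, which is exactly the part a language server would handle properly. `RenameFunction` and `rename_function` are illustrative names.

```python
import ast


class RenameFunction(ast.NodeTransformer):
    """Rename a function definition and the names that refer to it."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # also rewrite references inside the body
        if node.name == self.old:
            node.name = self.new
        return node

    def visit_Name(self, node):
        # Caveat: this matches every identifier with the old name,
        # including unrelated local variables (no scope analysis).
        if node.id == self.old:
            node.id = self.new
        return node


def rename_function(source, old, new):
    """Parse source, edit the tree, and turn it back into code."""
    tree = RenameFunction(old, new).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SOURCE = "def area(w, h):\n    return w * h\n\nprint(area(2, 3))\n"
RENAMED = rename_function(SOURCE, "area", "rect_area")

if __name__ == "__main__":
    print(RENAMED)
```

The edit never touches punctuation or whitespace, so there is no way to drop a delimiter; the cost is that comments and original formatting are lost when the tree is unparsed.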

It would need a fundamentally different model from an LLM to operate on it, but I'm convinced that thinking text is the endgame is a form of blub.

It's where we are now, and it's working very well, but it shouldn't be considered the long-term goal. We can do better.

* To be fair, this one hasn't been an issue for a while now.

furyofantares 1 month ago

For small games I work on, I make sure Claude (well, Codex CLI) can produce screenshots of whatever screen it's working on and evaluate them. It has some instructions on using codex exec (claude -p) to use a clean instance for evaluation, so it can pass a screenshot and a description of the expectation and get back a pass/fail and a description of the failure. The main agent can also just view the image, but for things with a clear pass/fail I prefer that it invoke a clean context.
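A rough sketch of what that clean-context check could look like as a script, not the actual setup described above. It assumes the evaluator CLI takes the prompt as its final argument and can open the image at the path named in the prompt; the reply format (first line PASS or FAIL) is a convention invented here, and the function names are hypothetical.

```python
import subprocess


def build_prompt(screenshot_path, expectation):
    """Describe the check and ask for a machine-readable first line."""
    return (
        f"Look at the screenshot at {screenshot_path}.\n"
        f"Expected: {expectation}\n"
        "Reply with a first line of exactly PASS or FAIL, "
        "then a short description of any failure."
    )


def parse_verdict(reply):
    """Return (passed, detail); anything that is not a clear PASS fails."""
    lines = reply.strip().splitlines()
    first = lines[0].strip().upper() if lines else ""
    if first in ("PASS", "FAIL"):
        return first == "PASS", "\n".join(lines[1:]).strip()
    return False, reply.strip()  # unrecognized reply: keep it all as detail


def evaluate_screenshot(screenshot_path, expectation, command=("codex", "exec")):
    """Run the evaluator in a fresh process so it starts with no prior context.

    Assumption: `command` (for example ("claude", "-p")) accepts the prompt
    as its last argument and prints the reply on stdout.
    """
    prompt = build_prompt(screenshot_path, expectation)
    result = subprocess.run(
        [*command, prompt], capture_output=True, text=True, check=True
    )
    return parse_verdict(result.stdout)
```

Treating an unparseable reply as a failure is deliberate: the main agent then sees the raw text instead of a false pass.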