Comment by tptacek

1 month ago

It allows Claude to take screenshots and generate keyboard inputs. It's like TUI Playwright.


tptacek

Maybe I'm not understanding it (totally possible!) but could Claude just do that by reading standard out and writing to standard in?

tptacek 1 month ago
I had a really hard time getting anything like that to work (you can't just read stdout and write stdin, because you're driving a terminal in raw mode), but it took like three sentences' worth of prompt to get Claude to use tmux to do this reliably.
- alehlopeh 1 month ago
  
  I tell Claude Code to use an existing tmux session to interact with, e.g., a Rails console, and it uses tmux send-keys and capture-pane for I/O. It gets tripped up if a pager is invoked, but otherwise it works pretty well. Didn’t occur to me to tell it to take screenshots.
  
- mrstackdump 1 month ago
  
  I would love to see your prompt if you ever post it anywhere.
  
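For anyone wanting to try the tmux approach described above, here is a minimal sketch of the send-keys / capture-pane loop in Python. It is not anyone's actual prompt or setup from this thread: the helper names and the `rails:0` target are hypothetical, and it assumes tmux is installed and a session with that name already exists. It also does no waiting, so a real caller would need to pause or poll before reading the pane.

```python
import subprocess


def send_text_cmd(target, text):
    # -l sends the text literally instead of looking up key names;
    # "--" keeps text that starts with "-" from being parsed as a flag.
    return ["tmux", "send-keys", "-t", target, "-l", "--", text]


def press_key_cmd(target, key="Enter"):
    # Without -l, tmux interprets names such as Enter, C-c, Up.
    return ["tmux", "send-keys", "-t", target, key]


def capture_cmd(target):
    # -p prints the visible pane contents to stdout instead of a paste buffer.
    return ["tmux", "capture-pane", "-p", "-t", target]


def run_line(target, line):
    """Type a line into the pane and press Enter (hypothetical helper)."""
    subprocess.run(send_text_cmd(target, line), check=True)
    subprocess.run(press_key_cmd(target), check=True)


def read_screen(target):
    """Return the pane's current visible text (the "screenshot")."""
    result = subprocess.run(
        capture_cmd(target), check=True, capture_output=True, text=True
    )
    return result.stdout
```

Usage would look something like `run_line("rails:0", "User.count")` followed by `print(read_screen("rails:0"))`. The pager problem mentioned above still applies: if the program opens a pager, the next send-keys goes to the pager, not the console.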
rsanheim 1 month ago

Also, many CLIs act differently when invoked while connected to a terminal (TUI/interactive) vs. not. So you’d run into issues where Claude could only test the non-interactive paths.
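The terminal-vs-pipe difference is easy to see with a small sketch. This assumes a Unix-like system (it uses the standard `pty` module); the child program and the function names are made up for illustration. The child just reports whether its stdout is a TTY, which is the same check many CLIs use to decide between interactive and plain output.

```python
import os
import pty
import subprocess
import sys

# Hypothetical child: prints "tty" or "pipe" depending on what stdout is.
CHILD = [
    sys.executable,
    "-c",
    "import sys; print('tty' if sys.stdout.isatty() else 'pipe')",
]


def run_with_pipe():
    # capture_output=True connects the child's stdout to a pipe.
    return subprocess.run(CHILD, capture_output=True, text=True).stdout.strip()


def run_with_pty():
    # Give the child a pseudo-terminal as stdout instead of a pipe.
    master, slave = pty.openpty()
    try:
        proc = subprocess.Popen(CHILD, stdin=subprocess.DEVNULL, stdout=slave)
        proc.wait()
        # The output is tiny, so one read after exit is enough here.
        data = os.read(master, 1024)
    finally:
        os.close(slave)
        os.close(master)
    return data.decode().strip()


if __name__ == "__main__":
    print("pipe run:", run_with_pipe())
    print("pty run: ", run_with_pty())
```

tmux sidesteps this because the program inside the pane really is attached to a terminal.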

So by screenshots you mean tmux capture-pane, not actual screenshots. So in essence it is using stdout, just not Claude’s own.

wakawaka28 1 month ago

"In essence," but terminals do stuff to render stdout that you do not want an LLM to have to replicate, I think. If your TUI does stuff in fullscreen or otherwise with a bunch of control codes, that is simple work for a terminal but potentially intractable for an LLM.
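To make the gap between raw stdout and the rendered screen concrete, here is a toy sketch. It is nowhere near a real terminal emulator: it handles only carriage return, newline, and the ANSI "erase to end of line" sequence (`ESC [ K`), and it treats `\n` as also returning to column 0. The `render` function and the sample stream are invented for illustration.

```python
ERASE_TO_EOL = "\x1b[K"


def render(stream):
    """Return the lines a terminal would show for a very small subset of codes."""
    lines = [[]]
    col = 0
    i = 0
    while i < len(stream):
        if stream.startswith(ERASE_TO_EOL, i):
            # Drop everything from the cursor to the end of the current line.
            del lines[-1][col:]
            i += len(ERASE_TO_EOL)
            continue
        ch = stream[i]
        if ch == "\r":
            col = 0  # cursor back to the start; existing text stays
        elif ch == "\n":
            lines.append([])
            col = 0
        else:
            row = lines[-1]
            if col < len(row):
                row[col] = ch  # overwrite what was already on screen
            else:
                row.append(ch)
            col += 1
        i += 1
    return ["".join(row) for row in lines]


# What a progress-style program might write to stdout.
raw = "Loading 10%\rLoading 50%\rDone." + ERASE_TO_EOL + "\n"

print(repr(raw))    # the byte stream a reader of stdout has to interpret
print(render(raw))  # what is actually left on screen
```

Reading `raw` directly means mentally replaying every overwrite; a full-screen TUI adds cursor movement, colors, and alternate screens on top of that, which is the work capture-pane gets for free from the terminal.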