Comment by SlinkyOnStairs

5 days ago

There are two possibilities here:

1) This tool breaks the Claude TUI. Exactly as described by the comment.

2) The Claude TUI itself is broken. In that case the comment is wrong, but it is entirely reasonable to assume that the "billion dollar TUI product" is capable of basic rendering and that it's the wrapper that broke it.

The fun here is that both of these pieces of software were made extensively using AI. No matter which of our options is the case here, the point stands: an AI-built product was shown, and it obviously looks like ass.

The issue is likely that the tmux session being generated is for some reason not propagating all terminal capabilities. Most likely it's an interop issue between tmux, Docker, and the image running under Docker. It could even be something about the terminal client that some part of the pipeline doesn't like.

Claude Code correctly reduces its display to 7-bit ASCII in response (still functional, although less pretty). Once I get around to fixing this, it will probably result in another section in https://github.com/kstenerud/yoloai/blob/main/docs/dev/backe...
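For anyone chasing the same thing, here is a minimal sketch of the kind of check that helps narrow it down. It only summarizes the environment hints that terminal programs commonly consult (TERM, COLORTERM, TMUX, and the locale variables). The function name describe_terminal is made up for illustration, and this is not a claim about how Claude Code itself decides to fall back to ASCII.

```python
import os


def describe_terminal(env=None):
    """Summarize environment hints that TUIs commonly consult.

    Hypothetical helper for debugging only; it does not reproduce the
    detection logic of any particular program.
    """
    env = os.environ if env is None else env
    # POSIX locale precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    locale_value = env.get("LC_ALL") or env.get("LC_CTYPE") or env.get("LANG") or ""
    normalized = locale_value.lower().replace("-", "")
    return {
        "term": env.get("TERM", ""),
        # tmux sets TMUX inside its sessions.
        "inside_tmux": "TMUX" in env,
        # COLORTERM is a de facto convention, not a standard.
        "truecolor": env.get("COLORTERM", "").lower() in ("truecolor", "24bit"),
        "utf8_locale": "utf8" in normalized,
    }


if __name__ == "__main__":
    for key, value in describe_terminal().items():
        print(f"{key}: {value}")
```

Running it on the host, inside the container, and inside the tmux session, then comparing the three outputs, should show at which hop a variable gets dropped or changed.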

Edit: Looks like it's the terminal. That's a rabbit hole for another day.

Running through VS Code's terminal via VSCode tunnel, it looks like it normally does.

https://freeimage.host/i/BySkkDN

  • What's really interesting in this comment chain is an observation I've expressed a lot more lately. When someone knows an LLM was involved they raise their expectations. I do it too in my own work and I have to remind myself things like "this bug would've also likely occurred with a human working at this level of complexity." The real question is did the operator arbitrarily and knowingly increase the level of complexity or is it appropriate for the task.

    • > The real question is did the operator arbitrarily and knowingly increase the level of complexity or is it appropriate for the task.

      There's one major reason to have higher expectations for autonomous systems (of all kinds, not just LLM-powered) than for humans, at least for systems intended to be deployed at scale, and that's the scale. If a human makes a mistake, has biases, or even intentionally breaks the rules, the impact of their actions is limited by the fact that they are one human. Something like an autonomous driving system or a coding agent, by contrast, is intended to be deployed by the thousands, millions, or more, and any problematic behaviors happen at that scale.

      There are obviously millions of bad drivers out there, but every one of the human ones is bad in different ways. If Waymo pushes a bad update there could be tens of thousands of "drivers" that suddenly become bad in identical ways.

      Humans also have the ability to learn from our mistakes. The ones you'd want to have working for you usually don't make the same one twice. LLMs are pretty good at making the same mistake repeatedly, even the simplest things like basic math or counting letters.

    • And there’s good reason for that. Anthropic, OpenAI, Salesforce, and so on have aggressively marketed LLMs as better than humans at working. It’s no surprise that when we find out something is built using an LLM, we expect it to match the marketing.


  > The Claude TUI itself is broken. 

I mean, this is also true. You forgot the third option, that both 1 and 2 are true (and the fourth, that neither is).

Seriously, the Claude TUI fucking sucks. I don't know how anyone thinks otherwise. It breaks constantly when you enter your editor (<C-g>), resize windows or panes, make another pane full screen, scroll, or do any number of other things. It is objectively a bad piece of software.

And honestly, are we surprised? Anthropic itself says that a lot of its code is written by Claude. They've been saying that for years. If you look at agents now and think "man, agents a few years ago sucked" then this shouldn't be surprising at all! I mean, FFS, the thing spits out text and they designed it like a fucking game engine. It is silly.