Comment by xyzsparetimexyz
10 hours ago
I doubt they'd do a very good job of debugging a GPU crash, or visual noise caused by forgotten synchronization, or odd-looking shadows.
Maybe for some things you could set it up so that the screen output is livestreamed back into the agent, but I highly doubt that anyone is doing that for agents like this yet.
> Maybe for some things you could set it up so that the screen output is livestreamed back into the agent, but I highly doubt that anyone is doing that for agents like this yet.
What do you mean by streaming? LLMs aren't yet advanced enough to consume a live video feed, but people have been feeding them screenshots from Playwright and desktop apps for years (Anthropic even released the Computer Use feature based on this).
Gemini has the best visual intelligence, but all three of the major models have supported this for a while. I don't think it'd help with fixing subtle problems in shadows, but it can fix other GUI bugs using visual feedback.
I am a GPU programmer (on the compute side), and the biggest challenge is lack of tooling.
For host-side code the agent can throw in a bunch of logging statements and usually printf its way to success. For device-side code there isn't a good way to output debugging info into a textual format understandable by the agent. Graphical trace viewers are great for humans, not so great for AI right now.
On the other hand, Cline's harness can interact with my website and click on stuff until the bugs are gone.
(Shameless plug) I've been using my debugger-cli [1] to enable agents to debug code using debuggers that support the Debug Adapter Protocol. It looks like cuda-gdb supports DAP, so I'd love to add support. I just need help from someone who can test it adequately (kernels, warps, etc. don't quite translate to a generic DAP client implementation).
[1] https://github.com/akiselev/debugger-cli
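For anyone who hasn't looked at DAP: on the wire it is just JSON messages, each prefixed with a `Content-Length` header, which is why a generic client is possible at all. Below is a minimal Python sketch of that framing, not code from debugger-cli. The helper names `encode_dap_request` and `decode_dap_message` and the `adapterID` value are made up for illustration.

```python
import json


def encode_dap_request(seq, command, arguments=None):
    """Frame one DAP request: Content-Length header, blank line, JSON body."""
    body = {"seq": seq, "type": "request", "command": command}
    if arguments is not None:
        body["arguments"] = arguments
    payload = json.dumps(body, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(payload)}\r\n\r\n".encode("ascii")
    return header + payload


def decode_dap_message(data):
    """Parse one framed message; return (message_dict, leftover_bytes)."""
    header, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    if len(rest) < length:
        raise ValueError("incomplete body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


if __name__ == "__main__":
    # "example-adapter" is a placeholder, not a real adapter ID.
    frame = encode_dap_request(1, "initialize", {"adapterID": "example-adapter"})
    message, leftover = decode_dap_message(frame)
    print(message["command"], len(leftover))
```

The generic part (framing, `seq` numbers, request/response pairing) is the easy bit. The GPU-specific concepts like kernels and warps have no slot in this envelope, which is where the testing help would matter.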
This is great. I hate LLMs fiddling around with logging calls to get some debugging capability.
Now they can be promoted from junior coders into mid-level coders :)