Comment by sinker

4 hours ago

I've started experimenting with Claude to run end-to-end tests for an Emacs package I'm developing. It's incredible.

The way it works:

1. You start Emacs in daemon (server) mode.

2. You prompt Claude to connect an Emacs client to that daemon and write/run tests.

That's it.
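
For anyone who wants to see roughly what those two steps look like underneath, here is a minimal sketch in Python. It is not what Claude literally runs; it only builds the standard `emacs --daemon` and `emacsclient` command lines. The server name `e2e-tests` and all the function names are made up for illustration.

```python
import subprocess

# Hypothetical daemon name, chosen for this sketch (not from the original setup).
SERVER_NAME = "e2e-tests"


def daemon_command(server_name: str = SERVER_NAME) -> list[str]:
    """Step 1: command that starts Emacs as a named background daemon."""
    return ["emacs", f"--daemon={server_name}"]


def frame_command(server_name: str = SERVER_NAME) -> list[str]:
    """Open a visible frame on the daemon so a human can watch the tests."""
    return [
        "emacsclient",
        f"--socket-name={server_name}",
        "--create-frame",
        "--no-wait",
    ]


def eval_command(elisp: str, server_name: str = SERVER_NAME) -> list[str]:
    """Step 2: command that asks the daemon to evaluate one Elisp form."""
    return ["emacsclient", f"--socket-name={server_name}", "--eval", elisp]


def run_elisp(elisp: str, server_name: str = SERVER_NAME) -> str:
    """Evaluate a form in the running daemon and return what emacsclient prints."""
    result = subprocess.run(
        eval_command(elisp, server_name),
        capture_output=True,
        text=True,
        check=True,  # raise if emacsclient cannot reach the daemon
    )
    return result.stdout.strip()
```

With the daemon up, something like `run_elisp("(+ 1 2)")` sends the form to the live session and hands back the printed result, which is all an agent needs to drive tests from a shell.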

Claude will then "pilot" an Emacs instance where you can watch it run the tests. Since almost everything in Emacs is a first-class function, and Emacs allows almost complete introspection, Claude can debug the code inside the execution environment. You can also just look at the piloted Emacs instance and prompt the agent about what's wrong in the running application state.
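
To make the introspection point concrete, here is a rough sketch of the kind of Elisp forms an agent could send to the running session (for example via `emacsclient --eval`). The Python helpers and the symbol `my-pkg-run` are hypothetical; the Elisp functions they emit (`fboundp`, `with-current-buffer`, `buffer-substring-no-properties`, `window-list`, `window-buffer`, `buffer-name`) are standard Emacs built-ins.

```python
def elisp_string(text: str) -> str:
    """Quote a Python string as an Elisp string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def function_defined_form(symbol: str) -> str:
    """Form that checks whether a function is defined in the live session."""
    return f"(fboundp '{symbol})"


def buffer_text_form(buffer_name: str) -> str:
    """Form that returns the plain text of a named buffer."""
    return (
        f"(with-current-buffer {elisp_string(buffer_name)} "
        "(buffer-substring-no-properties (point-min) (point-max)))"
    )


def visible_buffers_form() -> str:
    """Form that lists the buffers shown in the selected frame's windows."""
    return "(mapcar (lambda (w) (buffer-name (window-buffer w))) (window-list))"


if __name__ == "__main__":
    # `my-pkg-run` is a made-up symbol standing in for a package entry point.
    print(function_defined_form("my-pkg-run"))
    print(buffer_text_form("*Messages*"))
    print(visible_buffers_form())
```

Each form is just a string, so the agent can evaluate it, read the result, and decide what to inspect next without restarting anything.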

This is much more thorough than just having Claude write unit tests, because many of the issues you might encounter are visual/GUI things, which, again, can be examined by looking at the current application state because Emacs allows so much introspection.