Comment by echelon
8 hours ago
We need SSH access to the failed instances so we can poke around and iterate from any step in the workflow.
Production runs should be immutable, but we should be able to get in to diagnose, edit, and retry. It'd make diagnosing and fixing failures much faster.
The logs and everything should be there for us.
And speaking of logs, the GHA logs are really buggy sometimes. About half of the time I need them, they don't load.
I wrote something recently with WebRTC to get a terminal on failure: https://blog.gripdev.xyz/2026/01/10/actions-terminal-on-fail...
Are there existing solutions to this, like https://github.com/marketplace/actions/ssh-to-github-action-... ?