Comment by SonuSitebot

3 days ago

I've worked on a similar setup in Go — managing a pool of "always-on" containers for isolated task execution via docker exec. The official Docker SDK is solid but pretty low-level, so I get the desire for something more ergonomic. In my experience, there aren't many off-the-shelf Go libraries that give you full orchestration primitives (load balancing, health checks, scheduling) out of the box like you'd find in Nomad or K8s. But here are a few options worth exploring:

gofiber/fiber – an HTTP web framework, so not container-specific, but it can serve as the front end for a lightweight scheduler if you're rolling your own orchestration logic.

ory/dockertest – primarily for integration testing, but you can adapt its container start/stop logic for simplified lifecycle management.

hashicorp/go-plugin – good for decoupling workloads, especially if you're considering container-based isolation per plugin/command.

That said, most teams I’ve seen build their own lightweight layer on top of the Docker SDK, with Redis or internal queues for tracking load and health. Curious whether you're doing multi-host management or keeping this local? Also, make sure to time out aggressively and clean up zombie exec sessions; they pile up fast when you call docker exec a lot.
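
On the timeout point, here is a minimal sketch, assuming you shell out to the docker CLI instead of going through the SDK. RunWithTimeout and ErrTimedOut are hypothetical names, and "my-container" is a placeholder. One caveat: killing the docker exec client on timeout may leave the process running inside the container, so a real setup would probably also need cleanup inside the container.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// ErrTimedOut is a hypothetical sentinel returned when the deadline is hit.
var ErrTimedOut = errors.New("command timed out")

// RunWithTimeout runs a command and kills the local process if it
// has not finished within timeout. It returns the combined
// stdout/stderr collected so far.
func RunWithTimeout(timeout time.Duration, name string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release the timer

	// CommandContext kills the process when ctx is done.
	cmd := exec.CommandContext(ctx, name, args...)
	out, err := cmd.CombinedOutput()
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return out, ErrTimedOut
	}
	return out, err
}

func main() {
	// Assumption: a container named "my-container" is already running.
	out, err := RunWithTimeout(30*time.Second,
		"docker", "exec", "my-container", "sh", "-c", "echo ok")
	fmt.Printf("output=%q err=%v\n", out, err)
}
```

When the caller sees ErrTimedOut, it can mark that container as suspect and recycle it rather than hand it the next task.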

Would love to hear more if you open source anything from this!

Our use case is executing test scripts in sandbox mode. It is a multi-host, multi-region setup, and we might run millions of test scripts per day.

One of our engineers found https://testcontainers.com. It looks interesting, but it doesn’t seem to keep containers alive; instead, it starts and removes a container for each test. We might need to implement a locking mechanism to cap the number of containers running at a time. I don’t know whether it fits highly scalable test workloads.
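
For the cap on concurrent containers, a buffered channel used as a counting semaphore may be enough on a single host. This is only a sketch: Limiter and its methods are hypothetical names, the sleep stands in for "start container, run test, remove container", and a limit shared across hosts would need some external store, which this does not cover.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Limiter caps how many containers one process runs at a time.
type Limiter struct {
	slots chan struct{}
}

// NewLimiter creates a limiter with max concurrent slots.
func NewLimiter(max int) *Limiter {
	return &Limiter{slots: make(chan struct{}, max)}
}

// Acquire blocks until a slot is free or ctx is cancelled.
func (l *Limiter) Acquire(ctx context.Context) error {
	select {
	case l.slots <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// TryAcquire takes a slot without blocking and reports success.
func (l *Limiter) TryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees a slot taken by Acquire or TryAcquire.
func (l *Limiter) Release() { <-l.slots }

// InUse reports how many slots are currently held.
func (l *Limiter) InUse() int { return len(l.slots) }

func main() {
	limiter := NewLimiter(2) // at most 2 containers at once
	var wg sync.WaitGroup
	for i := 1; i <= 5; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := limiter.Acquire(context.Background()); err != nil {
				return
			}
			defer limiter.Release()
			// Placeholder for: start container, run test, remove container.
			time.Sleep(50 * time.Millisecond)
			fmt.Printf("test %d done\n", id)
		}(i)
	}
	wg.Wait()
}
```

Passing a context with a deadline to Acquire lets a queued test give up instead of waiting forever when the host is saturated.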