Comment by flakes
2 days ago
Jenkins is a place where you can be safe for a long time; however, it starts to break down at scale. I see it time after time with these batch workflow jobs. At the start, jobs run in seconds and everyone is happy.
Over time, jobs start taking long enough that you need to split them. Separate jobs are assigned slices of the original batch. Eventually, there are so many slices that you make a Jenkins job whose sole responsibility is firing off these individual jobs.
Then you start hitting the real pain points in Jenkins. Jobs are poorly allocated across your nodes/agents, often overloading CPU and memory on machines, and you struggle to manage the ungodly interface that is the Jenkins REST endpoint. You install many Jenkins plugins to try to address the scheduling problems, and end up with a team dedicated to managing this Jenkins infrastructure.
The scaling struggles keep piling up and you end up needing separate Jenkins instances to handle the load. Any attempt at replacing the Jenkins infrastructure grinds to a standstill, as the number of random scripts found in Jenkinsfiles has created an insurmountable vendor lock-in.
You read a post about a select-for-update job scheduler and reflect on simpler times. You cry as you refactor your Jenkins Groovy DSL.
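For anyone who hasn't run into the pattern: a select-for-update scheduler is roughly "workers claim rows from a jobs table inside a transaction". Below is a minimal sketch, not the scheduler from the post being discussed. The `jobs` table, its columns, and the function names are all hypothetical. SQLite has no `SELECT ... FOR UPDATE`, so `BEGIN IMMEDIATE` (which takes the write lock up front) stands in for the row lock here; on PostgreSQL the select would typically end in `FOR UPDATE SKIP LOCKED` instead.

```python
import sqlite3


def connect(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled explicitly with BEGIN/COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    # Hypothetical schema: one row per job slice.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn, payload):
    # Add a pending job and return its id.
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_next_job(conn):
    # Claim the oldest pending job, or return None if the queue is empty.
    # BEGIN IMMEDIATE acquires the write lock before the SELECT, so no
    # other worker can claim the same row between the read and the update.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Each worker would loop on `claim_next_job`, run the payload, and then mark the row finished (not shown). The scheduling policy is whatever the `ORDER BY` says, which is most of the appeal.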
It’s actually much more common than you think for people to reuse CI systems for cron tasking.
It’s always a mistake, but it’s easy in the moment and sticks around longer than I’d like.
CI systems like Jenkins are there and they're corp-approved.
Getting a weird third-party scheduling system with access to internal stuff approved is HARD in big corps.
So we (ab)use the CI system we have. It has scheduling and it already accesses internal resources.
What about Camunda? It’s a corporate workflow engine.
What's the thing you should replace Jenkins with at scale?
I’m a firm believer that there will never be a perfect general-purpose job scheduler. The priority for how jobs are scheduled is always deeply coupled to your business needs. General-purpose schedulers always end up as a jack of all trades but master of none. With a custom-built scheduler you get that control, but you do have to reinvent the wheel for a lot of features. Jenkins, Argo, Airflow, Cron, etc. all have their own pros and cons.