Comment by jpalomaki
2 months ago
Learned the hard way that it makes sense to use "flock" to prevent overlapping executions of frequently running jobs. The server started to slow down, my monitoring jobs started piling up, which slowed the server down even more.
*/5 * * * * flock -n /var/lock/myjob.lock /usr/local/bin/myjob.sh
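A quick way to see the `-n` (non-blocking) behavior that makes this safe: a second attempt fails fast instead of queueing behind the first. A minimal sketch (the lock path is just for the demo):

```shell
#!/bin/sh
# Demo of flock -n: while one process holds the lock, a second
# non-blocking attempt exits immediately with a non-zero status.
LOCK=/tmp/myjob.demo.lock   # illustrative path

flock -n "$LOCK" sleep 3 &  # first "job" holds the lock for 3 seconds
sleep 1                     # give it time to acquire the lock

if flock -n "$LOCK" true; then
    echo "acquired"
else
    echo "skipped: previous run still holds the lock"
fi
wait
```

With `-n` the overlapping run is skipped rather than queued, so jobs can't pile up the way they do when cron keeps launching new instances.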
We can also use a systemd timer, which ensures there is no overlap.
https://avilpage.com/2024/08/guide-systemd-timer-cronjob.htm...
Have you tested how this behaves on eventually consistent cloud storage?
I'm confused: is EBS eventually consistent? I assume it's strongly consistent, since otherwise a lot of other Linux things would break.
If you're thinking about using NFS, why would you want to distribute your locks across other machines?
Why would anyone want a distributed lock?
Sometimes certain containerized processes need to run according to a schedule, but maintainers also need a way to run them manually without the scheduled run starting or executing concurrently. A shared FS seems like the "simplest thing that could possibly work" distribution method for locks intended for that purpose, but unfortunately not all cloud storage volumes are strongly consistent, even to the same user, and it may take several milliseconds for the lock to take hold.
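For that use case, the fd-style flock idiom lets both the cron entry and a manual invocation contend for the same lock file. A sketch, with an illustrative lock path and fd number; on a shared FS this only helps if the filesystem actually honors the locking semantics:

```shell
#!/bin/sh
# Wrapper that both cron and a human can run: whoever gets the lock
# runs the job; any concurrent invocation exits immediately.
LOCK=/var/lock/myjob.lock   # illustrative; put it on the shared FS

(
    flock -n 9 || { echo "another run is in progress, exiting"; exit 1; }
    # critical section: the actual job
    /usr/local/bin/myjob.sh
) 9>"$LOCK"                 # fd 9 holds the lock; released when it closes
```

Holding the lock on a file descriptor for the whole subshell means it is released automatically even if the job crashes, with no stale-lockfile cleanup needed.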
If a file system implements its lock/unlock functions precisely to the spec, it should be fully consistent for the file or directory being locked. It does not matter whether the file system is local or remote.
In other words, it's not the author's problem. It's the problem of a particular storage vendor that may decide to throw the spec out of the window. But even in an eventually consistent file system, the vendor is better off ensuring that the locking semantics are fully consistent per the spec.