Comment by zbentley
12 hours ago
Very neat! I like this a lot, nice work.
After peeking the source, a few possible areas of improvement:
- You can use `fstat` and keep a file handle around, likely further improving performance (well, reducing the performance hit to other users of the filesystem by not resolving vfs nodes). If you do this, you'll have to check for file deletions.
- If you do stick with stat(2), it might be a good idea to track the inode number from the stat result in addition to the time,size tuple. That handles the "t,s = 1,2; honker gets SIGSTOPped/CRIU'd; database file replaced; honker started again", as well as renameat/symlink-swap fiddling. Changing inode probably should just trigger a crash.
- Also check the device number from the stat call. It sounds fringe, but the number of weird hellbugs I've dealt with in my career caused by code continually interacting with a file at the same time as something else mounted an equivalent path "over" the directory the file was originally in is nonzero.
- It's been a few years since I fought with this, but aren't there edge cases here if the system clock goes backwards? IIRC the inode timestamp isn't monotonic--right? There are various strategies for detecting clock adjustment, of various reliability, that you could use here, if so. Just checking if the mtime-vs-system-clock diff is negative is a start.
That covers the more common of the "vanishingly uncommon but I've still seen 'em" cases related to file modification detection. Whether you choose to cope with people messing with the file via utime(2) is up to you (past a point, it feels like coping with malicious misuse rather than edge cases). But since your code runs in a loop, you're well-positioned to do that (and detect drift/manipulations of the system clock): track a monotonic clock and use it to approximate the elapsed wall time between honker poller ticks (say it fast with an accent, and you get https://www.bbc.com/news/world-latin-america-11465127); if the timestamp reported by (f)stat(2) ever doesn't advance at the same rate, fall back to checksumming the file, or crashing or something. But this is well into the realm of abject paranoia by now.
It's been a decade or so since I worked in this area, so some of that knowledge is likely stale; you probably know a lot more than I do after developing this library even before considering how out-of-date my knowledge might be. When I worked on this stuff, I remember that statx(2) was going to solve all the problems any day now, and then didn't. More relevant, I also remember that the lsyncd (https://github.com/lsyncd/lsyncd) and watchman (https://github.com/facebook/watchman) codebases were really good sources of "what didn't I think of" information in this area.
But seriously, again, nice work! Those are nitpicks; this is awesome as-is!
Wow, thanks for the great feedback.
I actually looked at fstat, but the "check for deletions" piece, given I'm polling at 1kHZ, was the reason I decided not to use it. Older hardware actually made this a big issue but it's fast enough now I decided it wasn't a problem.
I'll ignore the malicious ones bc [out of scope declaration]. Object paranoia is an artifact of build trama and I respect that lmao.
I've just looked into the device number and system clock issues. I think what i'll end up doing is actually a combo of ncruces's above comment and your feedback: a 1kHZ data_version and a 10HZ stat() with version check. This gets around syscall load, avoid clock issues, avoids the WAL truncation issues that others have mentioned, and is both lighter weight and less bugabooable than my previous design.
Thanks again.