
Comment by herf

7 years ago

I spent many years optimizing "stat-like" APIs for Picasa - Windows just feels very different than Linux once you're benchmarking.

It turns out Windows/SMB is very good at "give me all metadata over the wire for a directory" and not so fast at single file stat performance. On a high-latency network (e.g. Wi-Fi) the Windows approach is faster, but on a local disk (e.g., compiling code), Linux stat is faster.

You've done an amazing job.

This is off-topic, but is there any chance of bringing the Picasa desktop client back to the masses?

There's nothing out there that matches Picasa in speed for managing large collections (especially on Windows). The Picasa Image Viewer is lightning-fast, and I still use them both daily.

There are, however, some things that could be improved (besides the functionality that disappeared when Picasa Online was taken away); e.g. "Export to Folder" takes its sweet time. But with no source out there and no support from the developers, that sadly won't happen.

I'm mostly clueless about Windows, so bear with me, but that makes no sense to me.

If SMB has some "give me stat info for all stuff in a directory" API call, then that's obviously faster over the network since it eliminates N round trips. But I'd still expect a Linux SMB host to beat a Windows SMB host at that, since FS operations are faster and the Linux host would understand that protocol too.

Unless what you mean is that Windows has some kernel-level "stat N" interface, so it beats Linux by avoiding syscall overhead, or by having an FS that's more optimized for that use case. But then it would also be faster when using an SMB mount on localhost, and whether it's over a high-latency network wouldn't matter (actually, that would amortize away some of the benefit).
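The two access patterns being contrasted can be sketched in Python: `os.scandir` is documented to use FindFirstFile/FindNextFile on Windows (where each entry already carries size and timestamps) and opendir/readdir on POSIX. The temp-directory setup is just scaffolding for the comparison.

```python
import os
import tempfile

# Scaffolding: a small directory with three one-byte files.
tmp = tempfile.mkdtemp()
for name in ("a.txt", "b.txt", "c.txt"):
    with open(os.path.join(tmp, name), "w") as f:
        f.write("x")

# Per-file approach: one stat() call (and, over SMB, one round trip) per entry.
sizes_stat = {n: os.stat(os.path.join(tmp, n)).st_size for n in os.listdir(tmp)}

# Bulk approach: scandir() enumerates the directory in batches; on Windows the
# returned entries already carry size/timestamps from the FindFirstFile data,
# so entry.stat() needs no extra system call there.
sizes_scandir = {e.name: e.stat().st_size for e in os.scandir(tmp)}

assert sizes_stat == sizes_scandir
```

Both give the same answer; the difference is only in how many calls (or round trips) it takes to get there.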

  • I think the idea is that you're accessing files sparsely and/or randomly.

    With the Linux approach you avoid translating (from disk representation to syscall representation) metadata you don't need, and the in-memory disk cache saves having to re-read it (some filesystems also require a seek per directory entry to read the inode data structure, which can be avoided if you don't care about that particular stat).

    With the Windows approach, the kernel knows you want multiple files from the same directory, so it can send a (slightly more expensive) bulk stat request, using only one round trip[0]. On Linux, the kernel doesn't know whether you're grabbing a.txt, b.txt, ... (a single directory-wide stat), foo/.git, bar/.git, ... (multiple single stats that could be pipelined), or just a single file, so it makes sense to use the cheapest request initially. If it then sees another stat in the same directory, it might make a bulk request, but that still incurred an extra round trip, and may have added useless processing overhead if you only needed two files.

    TLDR: access to distant memory is faster if assumptions can be made about your access patterns; access to local memory is faster if you access less of it.

    0: I'm aware of protocol-induced round trips, but I don't think it affects the reasoning.
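    The round-trip arithmetic here can be sketched with a toy model (every constant below is an assumption for illustration, not a measurement):

```python
# Hypothetical latency model: N sparse stats vs one bulk directory enumeration.
RTT_MS = 30.0          # assumed network round-trip time (e.g. Wi-Fi)
PER_ENTRY_MS = 0.01    # assumed server-side cost per entry in a bulk reply

def single_stats(n_files: int) -> float:
    """One round trip per stat request."""
    return n_files * RTT_MS

def bulk_enumerate(n_entries: int) -> float:
    """One round trip whose reply carries metadata for every entry."""
    return RTT_MS + n_entries * PER_ENTRY_MS

# Stat 100 files out of a 10,000-entry directory:
# per-file: 100 * 30 ms = 3000 ms; bulk: 30 + 100 ms = 130 ms.
assert single_stats(100) > bulk_enumerate(10_000)
```

    With these (made-up) numbers the bulk request wins even when you only wanted 1% of the directory; shrink the RTT toward local-disk latency and the inequality flips once the per-entry work dominates.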

    • Just think of the way the OSs are used.

      Getting to a directory:

      Linux: cd /dir (no info). Windows: open the directory and you get all the info, with different views depending on your current selection, like thumbnails for image files.

      In Windows you are always accessing this metadata, so it makes sense to speed it up. In Linux, even ls doesn't give you metadata unless you add the extra options, so it doesn't make sense to speed things up and waste storage on something infrequent.

      Seems like both ways are sound.

  • > If SMB has some "give me stat info for all stuff in a directory" API call

    It does: it backs FindFirstFile/FindNextFile[1], which return a struct of name, attributes, size, and timestamps per directory entry.

    Now I'm not sure how Linux does things, but for NTFS, the data from FindFirstFile is pulled from the cached directory metadata, while the handle-based stat-like APIs operate on the file metadata. When the file is opened[2], the directory metadata is updated from the file metadata.

    So while it does not have a "stat N" interface per se, the fact that it returns cached metadata in an explicit enumeration-style API should make it quite efficient.

    [1]: https://docs.microsoft.com/en-us/windows/desktop/api/fileapi... [2]: https://blogs.msdn.microsoft.com/oldnewthing/20111226-00/?p=...

    • I'm not sure how FindFirstFile/FindNextFile is going to be better than readdir(3) on Unix.

      At the NT layer, beneath FindFirstFile/FindNextFile, there is a call that says "fill this buffer with directory entry metadata." - https://docs.microsoft.com/en-us/windows/desktop/devnotes/nt... - I know FindFirstFileEx for example can let you ask for a larger buffer size to pass to that layer, thereby reducing syscall overhead in a big directory.

      If you look at getdirentries(2) on FreeBSD for example - https://www.freebsd.org/cgi/man.cgi?query=getdirentries - it's a very similar looking API. I seem to recall hearing that in the days before readdir(3) the traditional approach was to open(2) a dir and read(2) it, but I cannot find a source for that claim. At any rate you can imagine something pretty identical in the layer beneath readdir(3) on a modern Unix-like system, and it would be essentially the same as what Windows does.

      I guess file size needs an extra stat(2) in Unix, since it is not in struct dirent, so if you do care about that or some of the other WIN32_FIND_DATA members the Windows way will be faster.
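      That last point can be shown concretely: readdir()/struct dirent hands back names (and usually d_type) but no size, so sizes on Unix cost one extra stat() per entry, whereas WIN32_FIND_DATA already carries the size. A small sketch (the temp file is just scaffolding):

```python
import os
import tempfile

# Scaffolding: one 5-byte file.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "hello.txt"), "w") as f:
    f.write("hello")

# A readdir()-style listing gives names only -- no sizes:
names = os.listdir(tmp)

# So on Unix, sizes cost one extra stat() per entry...
sizes = {n: os.stat(os.path.join(tmp, n)).st_size for n in names}

# ...while WIN32_FIND_DATA already includes the file size, which is why a
# Windows directory enumeration can report sizes with no further calls.
assert sizes == {"hello.txt": 5}
```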


  • If by "host" you mean the client rather than the server, and if I understand correctly, the problem I anticipate is that the API doesn't let you use that cached metadata even if the client has already received it: there's no guarantee that when you query a file inside some folder, it'll be the same as it was when you enumerated that folder. So I'd assume you can't eliminate the round trip without changing the API. Not sure if I've understood the scenario correctly, but that seems to be the issue to me.

Anecdotally[1], javac does full builds because reading everything is faster than statting every file and comparing it with its compiled version. Eclipse works around this by keeping a change list in memory, which has its own drawback: external changes can push the workspace out of sync.

[1] I can't find a source on this, but I remember having read it a long time ago, so I'll leave it at that unless I can find an actual authoritative source.
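For reference, the per-file check that such an incremental build would do is just an mtime comparison, one stat per source/output pair (the `needs_rebuild` helper and file names here are hypothetical, purely for illustration):

```python
import os
import tempfile
import time

def needs_rebuild(src: str, out: str) -> bool:
    """Stale if the output is missing or the source was modified after it."""
    if not os.path.exists(out):
        return True
    return os.stat(src).st_mtime > os.stat(out).st_mtime

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "Foo.java")
out = os.path.join(tmp, "Foo.class")

open(src, "w").close()
assert needs_rebuild(src, out)            # no .class yet -> rebuild

open(out, "w").close()
os.utime(src, (time.time() + 10,) * 2)    # pretend the source was edited later
assert needs_rebuild(src, out)            # source newer -> rebuild
```

It's exactly this pair of stat() calls per file, repeated across a large tree, that the comment claims javac skips in favor of just reading everything.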

Do you know of any alternative to Picasa? Especially with regard to face recognition. Google Photos is objectively shit, as you need to upload all your photos for that.

  • There is digiKam for Linux (KDE) with facial recognition. I just started playing with it last night; I tested it on a small group of photos, and it's good so far.

    • Tried it; unfortunately it has huge problems with large collections, and its face recognition is sadly way worse too.

  • Depending on your platform - Apple's Photos is pretty good with facial recognition, and it's all done on device.

    • On which device? I can install macOS in a VM. Transferring all photos to an iPhone just for tagging seems ridiculous.

I can't quite recall the exact number, but wasn't the packet count for an initial listing over Windows SMB something like ~45 packets/steps in the transaction for each file?

Like I said, it was years ago, but I recall it being super chatty...