Comment by avar

7 years ago

I'm mostly clueless about Windows, so bear with me, but that makes no sense to me.

If SMB has some "give me stat info for all stuff in a directory" API call, then that's obviously faster over the network since it eliminates N round trips, but I'd still expect a Linux SMB host to beat a Windows SMB host at that, since FS operations are faster and the Linux host would also understand that protocol.

Unless what you mean is that Windows has some kernel-level "stat N" interface, so it beats Linux by avoiding the syscall overhead, or by having a FS that's more optimized for that use case. But then that would also be faster when using an SMB mount on localhost, and whether it's over a high-latency network wouldn't matter (actually that would amortize some of the benefits).

I think the idea is that you're accessing files sparsely and/or randomly.

With the Linux approach you avoid translating (from disk representation to syscall representation) metadata you don't need, and the in-memory disk cache saves having to re-read it (some filesystems also require a seek for each directory entry to read the inode data structure, which can be avoided if you don't care about that particular stat).

With the Windows approach, the kernel knows you want multiple files from the same directory, so it can send a (slightly more expensive) bulk stat request, using only one round trip[0]. On Linux, the kernel doesn't know whether you're grabbing a.txt,b.txt,... (a single directory-wide stat), or foo/.git,bar/.git,... (multiple single stats that could be pipelined), or just a single file, so it makes sense to use the cheapest request initially. If it then sees another stat in the same directory, it might make a bulk request, but that still incurred an extra round trip, and may have added useless processing overhead if you only needed two files.

TLDR: Access to distant memory is faster if assumptions can be made about your access patterns; access to local memory is faster if you access less of the local memory.

0: I'm aware of protocol-induced round trips, but I don't think it affects the reasoning.
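The round-trip arithmetic above can be sketched with a toy model. The numbers here are my own, purely illustrative constants (RTT_US and PER_ENTRY_COST_US are made up, not measurements of any real protocol):

```python
# Toy latency model: per-file stat round trips vs. one bulk request.
# All constants are assumptions for illustration only.
RTT_US = 10_000          # assumed 10 ms network round trip
PER_ENTRY_COST_US = 10   # assumed cost to process/serialize one entry

def per_file_stats(n_files):
    """Naive client: one round trip per stat."""
    return n_files * (RTT_US + PER_ENTRY_COST_US)

def bulk_stat(n_files):
    """One (slightly more expensive) request returning every entry."""
    return RTT_US + n_files * PER_ENTRY_COST_US

print(per_file_stats(1000))  # 10010000 us, ~10 s: dominated by round trips
print(bulk_stat(1000))       # 20000 us, ~20 ms: one round trip amortized
print(per_file_stats(1) == bulk_stat(1))  # True: identical for one file
```

The last line is the crux of the Linux-side heuristic: for a single file the cheap request costs nothing extra, so the guess only loses when a second stat in the same directory shows up.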

  • Just think of the way the OSs are used.

    get to a directory:

    Linux: cd /dir (no info)
    Windows: open a directory ... all the info, and different views depending on your current selection, like image file thumbnails

    In Windows you are always accessing this metadata, so it makes sense to speed it up, while in Linux even ls doesn't give you metadata unless you add the extra options, so it doesn't make sense to speed things up and waste storage on something that is infrequent.

    Seems like both ways are sound.

> If SMB has some "give me stat info for all stuff in a directory" API call

It does, it supports FindFirstFile/FindNextFile[1], which returns a struct of name, attributes, size and timestamps per directory entry.

Now I'm not sure how Linux does things, but for NTFS, the data from FindFirstFile is pulled from the cached directory metadata, while the handle-based stat-like APIs operate on the file metadata. When the file is opened[2], the directory metadata is updated from the file metadata.

So while it does not have a "stat N" interface per se, the fact that it returns cached metadata in an explicit enumeration-style API should make it quite efficient.
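As a concrete (hedged) illustration, Python's os.scandir exposes this same enumeration-style API: per its documentation it is implemented with FindFirstFileEx/FindNextFile on Windows, so the name, attributes, size and timestamps arrive with each entry and entry.stat() usually needs no extra system call there. This sketch just builds its own throwaway directory:

```python
import os
import tempfile

# Build a throwaway directory with one 5-byte file, then enumerate it.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "a.txt"), "w") as f:
        f.write("hello")

    sizes = {}
    with os.scandir(d) as it:
        for entry in it:
            # On Windows this stat result is served from the FindFirstFile
            # data cached on the entry; on Unix it may issue a stat(2).
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size

print(sizes)  # {'a.txt': 5}
```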

[1]: https://docs.microsoft.com/en-us/windows/desktop/api/fileapi... [2]: https://blogs.msdn.microsoft.com/oldnewthing/20111226-00/?p=...

  • I'm not sure how FindFirstFile/FindNextFile is going to be better than readdir(3) on Unix.

    At the NT layer, beneath FindFirstFile/FindNextFile, there is a call that says "fill this buffer with directory entry metadata." - https://docs.microsoft.com/en-us/windows/desktop/devnotes/nt... - I know FindFirstFileEx for example can let you ask for a larger buffer size to pass to that layer, thereby reducing syscall overhead in a big directory.

    If you look at getdirentries(2) on FreeBSD for example - https://www.freebsd.org/cgi/man.cgi?query=getdirentries - it's a very similar-looking API. I seem to recall hearing that in the days before readdir(3) the traditional approach was to open(2) a dir and read(2) it, but I cannot find a source for that claim. At any rate, you can imagine something pretty identical in the layer beneath readdir(3) on a modern Unix-like system, and it being essentially the same as what Windows does.

    I guess file size needs an extra stat(2) in Unix, since it is not in struct dirent, so if you do care about that or some of the other WIN32_FIND_DATA members the Windows way will be faster.
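To make that last point concrete, here's a small POSIX-flavoured sketch (mine, again via Python's os.scandir, which wraps readdir(3) on Unix): the file type can typically be answered from d_type without a syscall, while st_size needs the extra stat(2), precisely because size isn't in struct dirent.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    os.mkdir(os.path.join(d, "sub"))
    with open(os.path.join(d, "f"), "w") as fh:
        fh.write("abc")

    listing = []
    with os.scandir(d) as it:
        for entry in it:
            kind = "dir" if entry.is_dir() else "file"  # usually free: d_type
            size = entry.stat().st_size                 # extra stat(2) on Unix
            listing.append((entry.name, kind, size))

print(sorted(listing))  # the 3-byte file plus the subdirectory
```

(No size is asserted for the directory entry, since directory st_size varies by filesystem.)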

If by "host" you mean the client rather than the server, and if I understand correctly, the problem I anticipate is that the API doesn't let you use that cached metadata even when the client has already received it: there's no guarantee that when you query a file inside some folder, it'll be the same as it was when you enumerated that folder. So I'd assume you can't eliminate the round trip without changing the API. I may not have understood the scenario correctly, but that seems to be the issue to me.