Comment by a1369209993

7 years ago

I think the idea is that you're accessing files sparsely and/or randomly.

With the Linux approach you avoid translating (from disk representation to syscall representation) metadata you don't need, and the in-memory disk cache saves having to re-read it. (Some filesystems also require a seek for each directory entry to read the inode data structure, which can be avoided if you don't care about that particular stat.)
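A rough sketch of this "pay only for the metadata you ask for" idea, using Python's os.scandir as a stand-in (the function names are mine, purely illustrative):

```python
import os

def names_only(path):
    # Cheap pass: read directory entries without forcing a full
    # stat of every file -- just the names from the dirents.
    with os.scandir(path) as it:
        return [entry.name for entry in it]

def sizes_of(path, wanted):
    # Pay the stat cost only for the files we actually care about.
    sizes = {}
    with os.scandir(path) as it:
        for entry in it:
            if entry.name in wanted:
                # entry.stat() may be answered from data the OS already
                # returned with the directory listing, depending on platform.
                sizes[entry.name] = entry.stat().st_size
    return sizes
```

Notably, on Windows the directory-listing call already carries most stat fields, so entry.stat() there is usually free, which lines up with the "bulk metadata" design described above.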

With the Windows approach, the kernel knows you want multiple files from the same directory, so it can send a (slightly more expensive) bulk stat request, using only one round trip[0]. On Linux, the kernel doesn't know whether you're grabbing a.txt,b.txt,... (a single directory-wide stat would cover it), or foo/.git,bar/.git,... (multiple single stats that could be pipelined), or just a single file, so it makes sense to use the cheapest request initially. If it then sees another stat in the same directory, it might make a bulk request, but that still incurred an extra round trip, and may have added useless processing overhead if you only needed two files.
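To make the round-trip arithmetic concrete, here's a toy cost model (the numbers and function names are made up for illustration; real latencies obviously vary):

```python
def individual_cost(n_files, rtt):
    # One network round trip per single-file stat request.
    return n_files * rtt

def adaptive_bulk_cost(n_files, rtt, bulk_overhead):
    # First stat goes out alone (cheapest possible request); if a
    # second stat hits the same directory, upgrade to one bulk
    # request covering the rest -- two round trips total.
    if n_files <= 1:
        return rtt
    return 2 * rtt + bulk_overhead
```

With many files the bulk upgrade wins easily, but for exactly two files it can cost more than just issuing two individual stats, which is the "useless processing overhead" case above.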

TLDR: Access to distant memory is faster if assumptions can be made about your access patterns; access to local memory is faster if you access less of it.

0: I'm aware of protocol-induced round trips, but I don't think that affects the reasoning.

Just think of the way the OSs are used.

Getting to a directory:

linux: cd /dir (no info)

windows: open the directory ... all the info, with different views depending on your current selection, like image file thumbnails

In Windows you are always accessing this metadata, so it makes sense to speed it up. In Linux, even ls doesn't give you metadata unless you add the extra options (e.g. ls -l), so it doesn't make sense to speed up, and waste storage on, something that is infrequent.

Seems like both ways are sound.