Comment by andix

1 day ago

Isn't it really obvious that a user space fs will always be slow, and especially slow with small files?

I don't know the purpose of microsandbox, but such an article doesn't give me great confidence in exploring it further.

> Isn't it really obvious that a user space fs will always be slow, and especially slow with small files?

Small files seem like the perfect case for a user space fs... depending on what you mean by user space fs. If you mean interfacing with a FUSE (or similar) filesystem, where your program makes a syscall that context switches into the kernel, which context switches to the userspace FUSE layer, which hands the result back to the kernel and then back to your program ... that will be especially slow with small files, where the number of data bytes per context switch is small. OTOH, if you mean a user space fs where your program has a built-in filesystem it can access without context switching, then that will be of benefit ... especially if the files are small enough that you can pack multiple files into a single page.

  • If I got the article right, they were running FUSE inside the VM, and the VM's FUSE was talking to a process on the VM host (probably over a virtual network). That can't be fast, not even theoretically.

    Every access to the fs had to go through two processes and two kernels, plus virtual networking, and was probably even running on two different cores most of the time. That must be slow.
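To make the in-process alternative from the parent comment concrete, here is a minimal sketch of a store that packs several small files back to back so they share one page. The `PackedStore` class, its method names, and the 4096-byte page size are all assumptions made for illustration; this is not what microsandbox or any FUSE implementation does.

```python
PAGE_SIZE = 4096  # assumed page size; common on x86-64 Linux, but not universal


class PackedStore:
    """Toy in-process store that packs small files back to back in one buffer."""

    def __init__(self):
        self._data = bytearray()
        self._index = {}  # path -> (offset, length)

    def write(self, path, content):
        # Append-only: overwriting a path leaves the old bytes behind (fine for a toy).
        self._index[path] = (len(self._data), len(content))
        self._data += content

    def read(self, path):
        # A dict lookup and a slice: no syscall, no context switch.
        offset, length = self._index[path]
        return bytes(self._data[offset:offset + length])

    def pages_used(self):
        # Ceiling division of the buffer size by the page size.
        return -(-len(self._data) // PAGE_SIZE)


store = PackedStore()
for i in range(16):
    store.write(f"/etc/conf.d/{i}.conf", b"x" * 200)
```

Sixteen 200-byte files total 3200 bytes, so they all fit in a single 4096-byte page here, and every read stays inside the calling process.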

Not obvious to me.

A filesystem is a database: it is a type of key-value store. There is a hierarchical lookup key that points you to an unstructured block of data.

Many databases (Berkeley DB, PostgreSQL, SQLite) are then built on top of this unstructured database. There is absolutely nothing indicating that putting a key-value database with hierarchical keys and unstructured blocks in a single file will be slow.

It could be: naive indexing or rebalancing could be very slow. But it does not have to be. In fact Berkeley DB is a neat case study here. Superficially it is a ridiculously simple key-value store. Why does such a simple thing even need to exist, or have such a long-lived presence? It turns out that building the efficient structures needed to work with slow non-volatile storage is non-trivial. Early MySQL used Berkeley DB as a low-level storage engine. Note that MySQL's main selling point was speed before correctness.

See also: virtual machines, another ubiquitous case of a filesystem in userspace.

  • It's something completely different. A database like SQLite runs in the same process as the application.

    All the filesystem calls go through the kernel. A userspace file system is another process.

    It's not like SQLite, it's more like Postgres. Try sending a few hundred thousand small queries to Postgres, and be surprised how slow it's going to be.

    The file system API is not like SQL, which allows complex queries. It's a lot of tiny and simple requests that assume very low latency.

SQLite is essentially a user space queryable file system, and it can be faster than writing to the file system directly when working with small files.

  • SQLite is in-process. A user space file system is another process, like Postgres if we want to compare filesystems with databases. And Postgres is slow for many small queries, like a userspace file system.
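For anyone who wants to see what "SQLite as an in-process file store" looks like, here is a rough sketch using Python's standard `sqlite3` module. The `files` table, its columns, and the `write_file` / `read_file` / `list_dir` helpers are names made up for this example, and it uses an in-memory database, so it says nothing about real disk performance.

```python
import sqlite3

# The whole "filesystem" lives inside this process; queries are library calls.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, data BLOB NOT NULL)")


def write_file(path, data):
    # "with conn" wraps the statement in a transaction and commits on success.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO files (path, data) VALUES (?, ?)",
            (path, data),
        )


def read_file(path):
    row = conn.execute(
        "SELECT data FROM files WHERE path = ?", (path,)
    ).fetchone()
    return None if row is None else row[0]


def list_dir(prefix):
    # Naive prefix match; '%' or '_' inside the prefix would act as wildcards.
    rows = conn.execute(
        "SELECT path FROM files WHERE path LIKE ? ORDER BY path",
        (prefix + "%",),
    )
    return [r[0] for r in rows]


write_file("/notes/a.txt", b"hello")
write_file("/notes/b.txt", b"world")
```

Each call here is a function call into the SQLite library inside the same process, which is the distinction being drawn against a separate server process like Postgres or a FUSE daemon.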