
Comment by travisgriggs

9 days ago

I loved the idea of QNX. Got way excited about it. We were moving our optical food processor from dedicated DSPs to general purpose hardware, using 1394 (FireWire). The process isolation was awesome. The overhead of moving data through messages, not so much. In the end, we paid someone $2K to contribute isochronous mode/dma to the Linux 1394 driver and went our way with RT extensions.

It was a powerful lesson (amongst others) in what I came to call “the Law of Conservation of Ugly”. In many software problems, there’s a part that just is never going to feel elegant. You can make one part of the system elegant, but that often causes the inelegance to surface elsewhere in the system.

> what I came to call “the Law of Conservation of Ugly”. In many software problems, there’s a part that just is never going to feel elegant

This may be an instance of the Waterbed Principle: in any sufficiently complex system, suppressing or refactoring some undesirable characteristic in one area inevitably causes an undesirability to pop up somewhere else. It is as if there is some minimum amount of complexity/ugliness/etc that the entire system must contain while still carrying out its essential functions, and it must leak out somewhere.

https://en.wikipedia.org/wiki/Waterbed_theory

  • The terms I've seen used and prefer to use are "essential complexity" and "accidental complexity".

I have a really neat idea to improve the message passing speed in QNX: you simply use the paging mechanism to send the message. That means there is no copying of the data at all, just a couple of page table updates. You still have the double TSS load overhead (vs 1 TSS load in a macro kernel), but that is pretty quick.

But you are right that there is a price for elegance. It becomes an easier choice to make when you factor in things like latency and long term reliability / stability / correctness. Those can weigh much heavier than mere throughput.
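A toy model of the page-remapping idea above, as a rough sketch only. It is not QNX code: `AddressSpace` and `send_by_remap` are hypothetical names, a Python dict stands in for a page table, and a `bytearray` stands in for a physical frame. The point it illustrates is that the "send" touches only the two mappings and never copies the payload.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class AddressSpace:
    """Toy address space: maps virtual page numbers to physical frames."""

    def __init__(self, name):
        self.name = name
        self.page_table = {}  # virtual page number -> frame (a bytearray)

    def map(self, vpn, frame):
        self.page_table[vpn] = frame

    def unmap(self, vpn):
        return self.page_table.pop(vpn)


def send_by_remap(sender, sender_vpn, receiver, receiver_vpn):
    """Hand one page from sender to receiver by editing page tables only."""
    frame = sender.unmap(sender_vpn)   # sender loses access to the page
    receiver.map(receiver_vpn, frame)  # receiver now sees the same frame
    return frame


client = AddressSpace("client")
server = AddressSpace("server")

# The client builds its message in a page it owns.
frame = bytearray(PAGE_SIZE)
frame[:5] = b"hello"
client.map(0x10, frame)

# "Sending" is two page table updates; the payload bytes are never copied.
send_by_remap(client, 0x10, server, 0x40)
```

A real kernel would also need TLB invalidation and some answer for what the sender sees at the old address afterwards, both of which this model ignores.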

  • This is sort of what Mach does with "out-of-line" messages: https://web.mit.edu/darwin/src/modules/xnu/osfmk/man/mach_ms... https://dmcyk.xyz/post/xnu_ipc_iii_ool_data/

    (this is used under-the-hood on macOS: NSXPCConnection -> libxpc -> MIG -> mach messages)

    • Mach has always been a very interesting project. It doesn't surprise me at all to see that they have this already, but at the same time I was not aware of it so thank you. This also more or less proves that that may well be an avenue worth pursuing.


  • I haven't seen it implemented anywhere, but that sounds like the "pagetable displacement" approach described here: https://wiki.osdev.org/IPC_Data_Copying_methods#Pagetable_di...

    The same idea occurred to me a while ago too, which is how I originally found that link :)

    • How performant is that in practice? I thought setting pages was a fairly expensive process. Using a statically mapped circular buffer makes more sense to me at least.

      Disclaimer: I don't actually know what I'm talking about, lol
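For comparison, a minimal sketch of the statically mapped circular buffer alternative. The mapping is set up once and never changes, so there are no page table updates per message, but each message is copied into the ring and out again. `SlotRing`, `SLOT_SIZE` and `NUM_SLOTS` are made-up names, and the head/tail counters live in ordinary Python attributes here; a real cross-process ring would keep them inside the shared region and update them atomically.

```python
import mmap
import struct

SLOT_SIZE = 64  # bytes per slot, including a 4-byte length prefix (assumed)
NUM_SLOTS = 8   # number of slots in the ring (assumed)


class SlotRing:
    """Fixed-slot ring over a region that is mapped once, up front."""

    def __init__(self):
        self.buf = mmap.mmap(-1, SLOT_SIZE * NUM_SLOTS)  # anonymous mapping
        self.head = 0  # count of messages written so far
        self.tail = 0  # count of messages read so far

    def send(self, payload: bytes) -> bool:
        if len(payload) > SLOT_SIZE - 4:
            raise ValueError("payload too large for one slot")
        if self.head - self.tail == NUM_SLOTS:
            return False  # ring is full, caller must retry later
        off = (self.head % NUM_SLOTS) * SLOT_SIZE
        struct.pack_into("<I", self.buf, off, len(payload))
        self.buf[off + 4:off + 4 + len(payload)] = payload  # copy in
        self.head += 1
        return True

    def recv(self):
        if self.head == self.tail:
            return None  # ring is empty
        off = (self.tail % NUM_SLOTS) * SLOT_SIZE
        (length,) = struct.unpack_from("<I", self.buf, off)
        data = bytes(self.buf[off + 4:off + 4 + length])  # copy out
        self.tail += 1
        return data
```

Usage is just `ring = SlotRing()`, then `ring.send(b"...")` on one side and `ring.recv()` on the other.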


  • Passing the PTE sounds great for big messages (send/recv).

    For small messages (open), the userspace malloc is likely to have packed several small buffers into a single page, so there's a chance you'd need to copy the message to a new userspace page first. In that case the ordinary two copies might work out better.

    • The throughput limitation is really only an issue for big messages; for smaller ones the processing overhead will dominate.
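One way to picture the big-versus-small split is a check that decides per message whether remapping is even possible. This is only a sketch: `can_remap` and `choose_transfer` are hypothetical helpers, the 4096-byte page size is an assumption, and the rule used (buffer starts on a page boundary and covers whole pages) is the simplest possible one, not something any particular kernel is known to use.

```python
PAGE_SIZE = 4096  # assumed page size


def can_remap(addr: int, length: int, page_size: int = PAGE_SIZE) -> bool:
    """True if the buffer starts on a page boundary and spans whole pages."""
    return length > 0 and addr % page_size == 0 and length % page_size == 0


def choose_transfer(addr: int, length: int, page_size: int = PAGE_SIZE) -> str:
    """Pick 'remap' when the pages hold nothing but the message, else 'copy'."""
    return "remap" if can_remap(addr, length, page_size) else "copy"
```

A small malloc'd buffer such as `choose_transfer(0x2010, 64)` falls back to `"copy"`, while a page-aligned two-page buffer such as `choose_transfer(0x2000, 8192)` qualifies for `"remap"`.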

  • The QNX call to do that is mmap().

    • Yes, I know. But I rolled my own QNX clone and I figured it would be neat to do this transparently rather than making the application code it up explicitly. This puts some constraints on where messages can be located though, and that's an interesting problem to solve if you want to do it entirely without overhead.
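For reference, the explicit version looks roughly like this: the application sets up a shared mapping itself and both processes read and write the same pages. The sketch uses Python's `mmap` module on a Unix-like system as a stand-in for the C call, with `os.fork()` providing the second process; `roundtrip_via_shared_mapping` is a made-up name, and the message is assumed to fit in one page.

```python
import mmap
import os


def roundtrip_via_shared_mapping(message: bytes) -> bytes:
    """Parent writes a message into a shared page; child edits it in place."""
    # Anonymous mapping. On Unix, Python's mmap defaults to MAP_SHARED,
    # so the pages stay shared with the child across fork().
    region = mmap.mmap(-1, mmap.PAGESIZE)
    region[:len(message)] = message

    pid = os.fork()
    if pid == 0:
        # Child: uppercase the message directly in the shared page.
        region[:len(message)] = bytes(region[:len(message)]).upper()
        os._exit(0)

    os.waitpid(pid, 0)  # wait until the child has finished writing
    result = bytes(region[:len(message)])
    region.close()
    return result
```

The parent sees the child's edit without any message being sent, which is the part the application has to arrange by hand in the explicit approach.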


Is "optical food processor" a metaphor, or is this actually a device that would cut up food items based on image feedback?

  • Usually it's about sorting. Take a lot of whatever (french fries, green beans, etc), accelerate them to something like 3 m/s, launch them off the end of a belt, scan them for defects, and then use air jets to divert the defective items. Look on YouTube for it. It's mind boggling to see the scale at which french fries alone are produced. You see one line running at load, and then realize there are multiple lines in most plants, and there are hundreds of plants worldwide.

    The cooler machines were specialized for fries; they use a rotating knife drum above a belt to cut defect spots from fries.

    I've not done that for 17 years now; the newer machines are that much cooler.

    • That's awesome. Thanks for the explanation.

      I did find several machines like this on YouTube, and it's amazing to watch. (One of them had little motor-actuated slats that could kick the defective items away, almost like a foot kicking a soccer ball!)

There's an older talk Simon Peyton Jones (IIRC?) gave about some development or other in Haskell, in which he suggested that many software systems have some aspect of the swamp or the marsh into which you must eventually wade: a mucky, sticky, irreducible aspect to the problem that must be dealt with somewhere, regardless of how elegant the rest of the system is.

"that marsh thing" has stuck with me, and been a frequent contributor to my work and thinking. I'll happily take Law of Conservation of Ugly as a _much_ better name for the thought :)

Today though, I'd argue that with full DSMP support and much more capable systems, any overhead from message passing is much less of a concern, or at least outweighed by other benefits.