Comment by alanfranz

18 hours ago

Well...

I'm not sure what the author really wants to say. mmap is available in many languages (e.g. Python) on Linux (and many other *nix I suppose). C provides you with raw memory access, so using mmap is sort-of-convenient for this use case.

But if you use Python then, yes, you'll need a bytearray, because Python doesn't give you raw access to such memory - and I'm not sure you'd want to mmap a PyObject anyway?
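For illustration, here is a minimal sketch of what that looks like with Python's standard `mmap` module. The helper name `mmap_roundtrip` and the temporary file are my own assumptions for the demo, not anything from the article; the payload must be non-empty, since an empty file cannot be mapped.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(payload: bytes) -> bytes:
    """Write payload into a file through a memory mapping, then read it back.

    Hypothetical helper for illustration only; payload must be non-empty.
    """
    # Pre-size the file: a mapping cannot grow the file it is backed by.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\x00" * len(payload))
        path = f.name
    try:
        with open(path, "r+b") as f:
            # Length 0 means "map the whole file".
            with mmap.mmap(f.fileno(), 0) as mm:
                # The mapping behaves like a mutable bytes-like object.
                mm[:len(payload)] = payload
                mm.flush()
        # Read the file back through ordinary I/O to confirm the write landed.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

You get slice-style byte access here, not typed pointers as in C, which is the difference I mean.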

Also, writing and reading this kind of raw memory can be dangerous and non-portable, and I'm not really sure that the pickle analogy even makes sense. I strongly suspect (I've never tried) that if you mmap-read malicious data in C, a vulnerability would be _quite_ easy to exploit.

Actually, in Python you can reinterpret a bytearray (zero-copy) as another primitive C type, or even as an arbitrary structure, using the ctypes module.
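A rough sketch of that zero-copy reinterpretation, assuming a made-up 8-byte little-endian header (the `Header` layout and the `overlay_header` helper are hypothetical, chosen only for the example). `from_buffer` shares memory with the buffer instead of copying it, so writes through the structure show up in the bytearray:

```python
import ctypes


class Header(ctypes.LittleEndianStructure):
    # Hypothetical record layout: two unsigned 32-bit fields, 8 bytes total.
    _fields_ = [
        ("magic", ctypes.c_uint32),
        ("count", ctypes.c_uint32),
    ]


def overlay_header(buf: bytearray) -> Header:
    """Return a Header that aliases the first 8 bytes of buf (no copy)."""
    # from_buffer needs a writable buffer at least sizeof(Header) bytes long.
    return Header.from_buffer(buf)


buf = bytearray(b"\x01\x00\x00\x00\x02\x00\x00\x00")
hdr = overlay_header(buf)

# Reads decode the underlying bytes in place.
magic_before, count_before = hdr.magic, hdr.count

# Writes go straight through to the bytearray.
hdr.count = 0x0201
```

As far as I know the same call accepts any writable buffer object, so a writable mmap should work in place of the bytearray, though I haven't shown that here.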

Memory-mapped files have been a very common OS feature since the 90s. Many high-level languages expose them, whether through an OS-agnostic interface or a POSIX-specific one.

  • > very common OS feature since 90s

    And if you want to go farther back: even before it was called "mmap" or was a specific function you had to invoke, there were operating systems that used a "single-level store" (notably MULTICS and IBM's AS/400... err OS/400... err i5/OS... err today IBM i [seriously, IBM, pick a name and stick with it]). On those platforms the interface to disk storage is that the entire disk storage/filesystem is always mapped into the same address space as the rest of your process's memory. Memory-mapped files were basically the only interface there was, and the operating system "magically" persisted certain areas of your memory to permanent storage.