← Back to context

Comment by whartung

3 months ago

This may fall in the "you think you do, but you don't category", but I've always wanted a Smalltalk (or similar, not that picky) with a persistent virtual memory.

That is, the VM is mapped to a backing file, changes persisted automatically, no "saving", limited by drive space (which, nowadays, is a lot). But nowadays we also have vast memory space to act as a page cache and working memory.

My contrived fantasy use case was having a simple array name "mail", which an array containing all of my email messages (in email object, of course). Naturally as you get more mail, the array gets longer. Also, as you delete mail, then the array shifts. It's no different, roughly, than the classic mbox format, save it's not just text, its objects.

You can see if you delete a email, from a large (several GBs), there would be a lot of churn. That implies maybe it's not a great idea to use that data structure, but that's not the point. You CAN use that data structure if you like (just like you can use mbox if you like).

Were it to be indexed, that would be done with parallel data structures (trees or hashes or whatever).

But this is all done automagically. Just tweaks to pages in working memory backed by the disk using the virtual memory manager. Lots and lot of potential swapping. C'est la vie, no different from anything else. This what happens when you map 4TB into a 16GB work space.

The problem with such a system, is how fragile is potentially is. Corrupt something and it happily persists that corruption, wrecking the system. You can't reboot to fix it.

Smalltalk suffers from that today. Corrupt the image (oops, did I delete the Object become: method again?), and its gone for good. This is mitigated by having backup images, and the changelist to try to bring you back to the brink but no further.

I'm guessing a way to do that in this system is to use a copy on write facility. Essentially, snapshot the persistent store on each boot (or whatever), and present a list of previous snapshot at start up.

Given the structure of a ST VM you'd like to think this is not that dreadful to work up. I'd like to think a paper napkin implementation PoC would be possible, just to see what it's like. One of those things were the performance isn't really that great, but the modern systems are so fast, we don't really notice it in human terms.

But I do think it would be interesting.

Smalltalk suffers from mis-information —

> oops, did I delete the Object become: method again?), and its gone for good.

And then you admit actually it's not gone for good because if you created the method that will be recorded in the changes.log file and if it was a provided method that will still be in the provided sources file.

https://cuis-smalltalk.github.io/TheCuisBook/The-Change-Log....

Have you looked at Pharo? Their git integration makes it relatively easy to export and backup parts of your main image, and to pull the things back into a fresher one once you mess up.