Comment by 2RTZZSro
8 years ago
Would it be feasible to keep a set of Python interpreters running at all times, feed commands to each already-running interpreter round-robin, and then clean up the interpreter environment out-of-band after a task completes?
The Java ecosystem had this with Drip, and I think it turned out not to be a great idea in practice: the magazine of VMs gets exhausted when you don't want it to, the VMs get into odd states, and there were other problems I can't quite remember.
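For what it's worth, here is a rough sketch of the warm-pool idea, not a tested design. It swaps "clean up the environment" for the simpler "throw the used interpreter away and start a fresh one", since a new process is the only state reset that is certainly complete. `WarmPool` and `WORKER_LOOP` are hypothetical names made up for this example, and the worker uses `eval`, so it is only suitable for trusted input.

```python
import itertools
import subprocess
import sys

# Hypothetical worker: evaluates one expression per input line.
# eval() is for illustration only; never feed it untrusted input.
WORKER_LOOP = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(eval(line), flush=True)\n"
)


class WarmPool:
    """Keep `size` interpreters started ahead of time; use each one once."""

    def __init__(self, size=2):
        self.workers = [self._spawn() for _ in range(size)]
        self._slots = itertools.cycle(range(size))  # round-robin order

    @staticmethod
    def _spawn():
        # Popen returns immediately, so the interpreter warms up in the
        # background while other workers serve requests.
        return subprocess.Popen(
            [sys.executable, "-c", WORKER_LOOP],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    @staticmethod
    def _retire(worker):
        worker.stdin.close()  # EOF ends the worker loop
        worker.wait()
        worker.stdout.close()

    def run(self, expression):
        slot = next(self._slots)
        worker = self.workers[slot]
        worker.stdin.write(expression + "\n")
        worker.stdin.flush()
        result = worker.stdout.readline().strip()
        # Recycle instead of resetting: replace the used interpreter.
        self._retire(worker)
        self.workers[slot] = self._spawn()
        return result

    def close(self):
        for worker in self.workers:
            self._retire(worker)
```

Usage would be something like `pool = WarmPool(size=4)`, then `pool.run("1 + 2")`, then `pool.close()`. A burst of requests larger than the pool still pays full startup cost, which is the exhaustion problem in miniature.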
Or just use the operating system's `fork` system call?
There's also nailgun for Java which sounds like it works a little differently: http://martiansoftware.com/nailgun/
I guess a fork()'ed process triggers copy-on-write behavior in the kernel once it starts writing to its pages. So that's latency (the copying) you could still optimize away.
A common solution for web application servers ("preforking").
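A minimal sketch of the fork approach on a POSIX system, assuming the parent has already done the expensive startup and imports. `run_in_fork` is a hypothetical helper name, and the result is passed back as plain text over a pipe to keep the example short.

```python
import os


def run_in_fork(task, *args):
    """Run task(*args) in a forked child; return (output_text, exit_code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy-on-write view of the warmed-up parent.
        os.close(read_fd)
        status = 0
        try:
            result = task(*args)
            with os.fdopen(write_fd, "w") as out:
                out.write(str(result))
        except BaseException:
            status = 1
        finally:
            # _exit skips atexit handlers and buffers inherited from the parent.
            os._exit(status)
    # Parent: read the child's output, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as inp:
        data = inp.read()
    _, wait_status = os.waitpid(pid, 0)
    return data, os.WEXITSTATUS(wait_status)
```

Whatever the task does to interpreter state happens in the child only, so the parent stays clean for the next request without any reset logic.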
The idea of keeping persistent interpreters doesn't really work for Python because the interpreter is full of state in places you'd never expect -- it's hard to reset the interpreter to a sane state after it ran some unknown program.
I may be wrong, but I would bet that copy-on-write of pages would be hardly visible for most workloads. Copying is quite fast when you do it in batches (4k per page).
You might want to measure it before you optimize it! Oftentimes I find that forks where I don't write much are quite inexpensive, with little COW action.
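One rough way to measure it, as a sketch rather than a proper benchmark: time a fork, immediate child exit, and wait, with some memory allocated in the parent. `time_fork` is a hypothetical name and the 20 MiB ballast is an arbitrary choice; results will vary by kernel and machine.

```python
import os
import time


def time_fork(rounds=20):
    """Return mean wall-clock seconds for fork + child exit + wait."""
    start = time.perf_counter()
    for _ in range(rounds):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child exits at once, touching almost no pages
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    # Arbitrary 20 MiB of touched memory standing in for parent state.
    ballast = bytearray(b"x" * (20 * 1024 * 1024))
    print(f"mean fork round trip: {time_fork() * 1e3:.3f} ms")
```

To see the copy-on-write cost itself, you would have the child write to the ballast before exiting and compare the two timings.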
Kind of like this?
https://github.com/tbug/aiochannel
I also think David Beazley's curio has a wonderful way of explaining the same concept
http://curio.readthedocs.io/en/latest/reference.html#module-...
Asyncio has made this sort of programming a lot easier. I imagine you could do the same thing with multiprocessing and thread pools.
Yes but with the added complexity and resource usage it's not a good general solution. If every app behaved this way we'd be in a worse place overall.
I imagine this could be handled by some kind of "fork", where you instantly duplicate the whole process with copy-on-write.
What would be really nice is checkpoint and restart (i.e., unexec), but it turns out that it's extremely hard to implement and get right in a non-managed environment.