Comment by chubot

8 years ago

I don't think it's fixable in Python unfortunately. As someone else pointed out, the fact that it got WORSE in Python 3, and not better, is a bad sign. Python 3 was the one chance to fix it -- to introduce breaking changes.

As I mentioned, this problem has bugged me for a long time, since at least 2007. Someone else also mentioned the problem with Clojure, and with JIT compilers in general. I'm interested in Clojure too, but my shell-centric workflow is probably one reason I don't use it.

In 2012 I also had the same problem with R, which starts even more slowly than Python. I wrote a command line wrapper that would keep around persistent R processes and communicate with them. I think I can revive some of my old code and solve this problem -- not in Python, but in the shell! Luckily, I'm working on a shell :)

http://www.oilshell.org/

In other words, the solution I have in mind is single-threaded coprocesses, along with a simple protocol to exchange argv, env, the exit code, and stdout/stderr (think CGI or FastCGI). Coprocesses are basically like servers, but they have a single thread to make porting existing apps easy (i.e. turning most command line apps into multi-threaded servers is nontrivial).
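To make that concrete, here's one shape such a protocol could take over a bash coprocess. Everything here is invented for illustration -- the line-oriented `arg`/`env`/`status` framing and the echo-back worker are stand-ins, not a spec:

```shell
#!/usr/bin/env bash
# Sketch of a CGI-like request/response framing over a bash coprocess.
# The shell sends argv and env as prefixed lines ending with a blank line;
# the worker replies with its output and a final "status N" line.
coproc WORKER { python3 -u -c '
import sys
while True:
    argv = []
    env = {}
    got_request = False
    for line in sys.stdin:
        line = line.rstrip("\n")
        if not line:
            got_request = True
            break
        kind, _, val = line.partition(" ")
        if kind == "arg":
            argv.append(val)
        elif kind == "env":
            k, _, v = val.partition("=")
            env[k] = v
    if not got_request:
        break  # stdin closed: shut down
    # "Run" the request; a real worker would dispatch on argv here.
    print("stdout " + " ".join(argv))
    print("status 0")
'; }

# One request: argv = [hello, world], env = {USER: alice}
{
  echo "arg hello"
  echo "arg world"
  echo "env USER=alice"
  echo
} >&"${WORKER[1]}"

read -r out    <&"${WORKER[0]}"
read -r status <&"${WORKER[0]}"
echo "$out"      # stdout hello world
echo "$status"   # status 0

# Close the worker's stdin so it exits cleanly.
fd=${WORKER[1]}
exec {fd}>&-
wait "$WORKER_PID"
```

The point of the blank-line framing is that the worker can serve many requests over the same pipe, so the interpreter only starts once.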

If you're interested I might propose something on Zulip. At the very least I want to dig up that old code.

http://www.oilshell.org/blog/2018/04/26.html

I think it's better to solve this problem in shell than Python/R/Ruby/JVM. There's no way all of them will be fixed, so the cleaner solution is to solve it in one place by introducing coprocesses in the shell. I will try to do it in bash without Oil, but it's possible a few things will be easier in Oil.

> introducing coprocesses in the shell

I did this with bash and Python a few years ago when I learned about the "coproc" feature (which, by the way, only supports a single coprocess per bash process, unless I misunderstood it).
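For reference, the basic pattern looks something like this -- the doubling "work" is a placeholder for whatever the script actually did:

```shell
#!/usr/bin/env bash
# Keep one warm Python interpreter around and feed it requests line by line,
# instead of paying interpreter startup on every invocation.
coproc python3 -u -c '
import sys
for line in sys.stdin:
    print(int(line) * 2)   # placeholder "work": double the input
'

echo 21 >&"${COPROC[1]}"
read -r reply <&"${COPROC[0]}"
echo "reply: $reply"       # reply: 42

fd=${COPROC[1]}
exec {fd}>&-               # close its stdin; the loop ends and it exits
wait "$COPROC_PID"
```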

But it turns out I tend to open new terminal windows a lot, which meant the coprocess needed to relaunch all the time anyway, so it wasn't very useful. Even if I started it lazily, to avoid slowing down every shell startup, most of my Python invocations tended to be from a new shell, so there was no real benefit.

Maybe if you have a pool of worker processes that's not tied to any individual shell process, and connect to them over a Unix-domain socket or something...
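Something along those lines might look like this. The socket path and the doubling worker are invented here, and note the demo client is itself a fresh python3 -- which defeats the purpose; a real client would have to be a shell builtin or a tiny compiled helper to be worth it:

```shell
#!/usr/bin/env bash
# Sketch: a worker that outlives any one shell, reachable over a Unix-domain
# socket, so every new terminal can reuse the same warm process.
SOCK=$(mktemp -d)/worker.sock

python3 - "$SOCK" <<'EOF' &
import socket, sys
srv = socket.socket(socket.AF_UNIX)
srv.bind(sys.argv[1])
srv.listen(1)
while True:
    conn, _ = srv.accept()
    data = conn.recv(1024).decode()
    conn.sendall(str(int(data) * 2).encode())   # placeholder "work"
    conn.close()
EOF
WORKER_PID=$!

# Any shell can now connect as a client once the socket exists.
until [ -S "$SOCK" ]; do sleep 0.1; done
reply=$(python3 -c '
import socket, sys
c = socket.socket(socket.AF_UNIX)
c.connect(sys.argv[1])
c.sendall(b"21")
print(c.recv(1024).decode())
' "$SOCK")
echo "reply: $reply"       # reply: 42

kill "$WORKER_PID"
```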

  • Hm yeah I was hoping to do it in a way that's compatible with unmodified bash, but maybe it will only be compatible with Oil to start.

    Basically I think there should be a "coprocess protocol" that makes persistent processes look like batch processes, roughly analogous to CGI or FastCGI.

    I thought that could be built on top of bash, but perhaps that's not possible.

    I'll need to play with it a bit more. I think in bash you can have multiple named coprocesses, with their descriptors stored in an array like ${COPROC[@]} or ${MY_COPROC[@]}. But there are definitely issues around process lifetimes, including the ones you point out. Thanks for the feedback.

    • I looked into the issue with multiple coprocesses in bash again to make sure. While you would naturally think you could simply give them different names, it's unfortunately not supported:

      https://lists.gnu.org/archive/html/bug-bash/2011-04/msg00059...

      Bash will print a warning when you start the second one, and the man page explicitly says at the bottom: "There may be only one active coprocess at a time."
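It's easy to check. This captures the shell's own stderr to see the warning; the exact wording may differ across bash versions, but recent ones include "still exists":

```shell
#!/usr/bin/env bash
# Start two coprocesses at once; bash warns about the second.
log=$(mktemp)
exec 3>&2 2>"$log"            # stash stderr, capture shell warnings in $log
coproc FIRST  { sleep 2; }
coproc SECOND { sleep 2; }    # bash warns that coproc FIRST "still exists"
exec 2>&3 3>&-                # restore stderr
warned=$(grep -c 'still exists' "$log")
echo "warnings: $warned"      # warnings: 1
kill "$FIRST_PID" "$SECOND_PID" 2>/dev/null
```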