
Comment by quotemstr

8 years ago

I've always been disappointed by how large software projects, both FOSS and commercial, lose their "can do" spirit with age. Long-time contributors become very quick with a "no". They dismiss longstanding problems as illegitimate use cases and reject patches with vague and impervious arguments about "maintainability" or "complexity". Maybe in some specific cases these concerns might be justified, but when everything garners this reaction, the overall effect is that progress stalls, crystallized at the moment the last bit of technical boldness flowed away.

You can see this attitude of "no" on this very HN thread. Read the comments! Instead of talking about ways we can make Python startup faster, we're seeing arguments that Python shouldn't be fast, we shouldn't try to make it faster, and that programs (and, by implication, programmers) who want Python startup to be fast are somehow illegitimate. It's a dismal perspective. We should be exercising our creativity as a way to solve problems, not finding creative ways to convince ourselves to accept mediocrity.

This isn't an attitude of "no" - it's an attitude of "yes" to other things. The arguments are that making Python startup fast makes other things worse, and we care about those other things.

Here are some other things we can say "yes" to:

- Rewrite as much of Mercurial in Rust as possible, which will provide performance improvements well beyond what Python can possibly offer. https://www.mercurial-scm.org/wiki/OxidationPlan

- Spend resources on developing PyPy, which (being a JIT) has relatively slow startup but much faster performance in general, for people who want fast performance.

- Write compilers from well-typed Python to native code.

- Keep CPython easy to hack on, so that more people with a "can do" spirit can successfully contribute to CPython instead of it being a mess of special cases in Guido's head.

Will you join me in saying "yes" to these things and not convincing ourselves to accept mediocrity?

  • I have to note that none of the projects you suggested, all of which are good and useful, will do anything to address the cpython startup latency problem under discussion. Why shouldn't cpython be better?

    There's also no reason to believe that startup improvements would make the interpreter incomprehensible; the unstated assumption that improvements in this area must hurt hackability is interesting. IME, optimizations frequently boost both simplicity and performance, usually by unifying disparate code paths and making logic orthogonal.

    • I think you misunderstood the point. These weren't things that would address the cpython startup problem - these were other priorities that can be worked on, instead of (or in addition to) the latency problems under discussion.

      Saying yes to fixing one thing usually means saying no to all the other things you can be doing with your time instead. Unless you're lucky and can "kill 2 birds with 1 stone".

      5 replies →

    • > Why shouldn't cpython be better?

      The point is that "better" is almost never a well-defined direction unless you only consider a single use-case. It's almost always a tradeoff, especially in a widely used project.

      A language is a point on a landscape of possible language variants, and "better" is a different direction on that landscape for every user.

    • > cpython startup latency problem

      The problem under discussion is that projects that currently use CPython have slow startup. One potential solution is for those projects not to use CPython. (Certainly it's not the only potential solution, but, a language that tries to be all things to all people isn't going to succeed. Python has so far done an extraordinarily good job of being most things to all people, with "I want native-code performance" being one of the few out-of-scope things.)

    • >IME, optimizations frequently boost both simplicity and performance, usually by unifying disparate code paths and making logic orthogonal.

      I would really like to see some examples where this is the case. Optimizations in my experience have made systems more brittle, less portable and ultimately less maintainable.

      2 replies →

  • I'm so longing for a Python(like) compiler.

    MicroPython put together a Python in 250kb. Why the hell can't we make an LLVM frontend for Python that can use type hints for optimization? Sure, you lose some dynamic features as you optimize for speed, but that's the dream: quickly write a prototype without caring about types, then optimize later by adding types and removing dynamicism.

    I'm currently learning Racket and LLVM and I have about 70 more years to live. I'm gonna try to make Python fast on slow weekends 'til I die.

  • > PyPy, which (being a JIT) has relatively slow startup

        > time pypy -c 'print "Hello World"'
        Hello World
        pypy -c 'print "Hello World"'  0.08s user 0.04s system 96% cpu 0.120 total

        > time luajit -e 'io.write("Hello World!\n")'
        Hello World!
        luajit -e 'io.write("Hello World!\n")'  0.00s user 0.00s system 0% cpu 0.002 total

    • Sometimes I wonder why we're not all using Lua instead of Python. Lua seems to get a strange amount of hate in some circles, but I've found both Lua and Python to be reasonably pleasant languages to work with.

      8 replies →

  • None of these address, for instance, the issue raised about the firefox build invoking python many times. This seems both an accepted use case of cpython and an area where traditionally cpython has a huge edge on the JVM and PyPy. If scripts are not a priority, what is the expected use case of cpython?

    I would like to note that CPython's ties to the PyObject C ABI seem to stymie rather than encourage "hacking". CPython seems to have traditionally valued stability over all else.... see the issues PyPy has had chasing compatibility with C while retaining speed.

    So: normally i’m with you and a language should lean into its strengths, but i’ve always listed startup time as a primary strength of python!

    • In my experience, "script" is usually well-correlated with "a bit of inefficiency is okay." There's a reason that, say, many UNIX commands that could be implemented as scripts (true, false, yes) are actually implemented as binaries. There's a reason that most commercial UNIXes/clones (Solaris, macOS, Ubuntu, RHEL, etc.) switched from a script-based startup mechanism to a C-program-based one.

      I certainly write and continue to write Python scripts where even an extra half second won't matter. It's doing some manipulation of data where the cost of what it's doing is dominated by loading the data (e.g., grabbing it from some web service), and even if the script is small and quick, it's not so small and quick that I'll notice 50-100 ms being shaved off of it.

      Use cases where CPython continues to make sense to me are non-CGI web applications and things like Ansible, where load time isn't sensitive to milliseconds and runtime performance is pretty good. (Although if you believe the PyPy folks, perhaps everything that's PyPy-compatible should be running on PyPy.)

      1 reply →

  • This hits the nail on the head.

    Optimization is very, very rarely completely „free“ - it's usually a conscious trade of some property for another trait that‘s deemed more important in a specific case.

    Simplicity for performance. Code size for compilation speed. Startup time for architectural complexity. UX for security.

    For a great product, you need to say „no“ much more often than not. Do one thing and do it well. Be Redis, not JBoss.

    I love how this article gets down to the essence of it: https://blog.intercom.com/product-strategy-means-saying-no/

  • > Rewrite as much of Mercurial in Rust as possible, which will provide performance improvements well beyond what Python can possibly offer. https://www.mercurial-scm.org/wiki/OxidationPlan

    I read that article and I'm still wondering: why Rust?

    • The last three paragraphs of the section "Why use Rust?" should address that - basically, they have experience with solving this problem by writing parts of the code in C, they are not fans of that experience, and Rust is a compelling better C (and there are specific reasons they don't think C++ is compelling).

      Are you asking in comparison to some other language? The most obvious other languages in the "compelling better C" niche I think are Ada, D, and Go; Ada and D (I think, I do not know them well) don't have as good a standard library or outside development community, and Go is less suited than Rust to replacing portions of a process. Go would be a reasonable choice were one writing a VCS from scratch today.

      12 replies →

  • I agree with you. Given limited development resources, saying "no" is difficult but important.

  • I am slightly afraid to ask, but what is a "well typed python"?

    • Have you seen MyPy, static type annotations / checking for Python? http://mypy-lang.org/

      In context, what I'm really getting at "a sufficiently non-dynamic subset of Python that it can be compiled statically, but also a sufficiently large one that real Python programs can have a chance of being in the subset." PyPy has a thing called RPython that fits the former but not really the latter (I don't know of any non-PyPy-related codebases that work in RPython). In general, adding complete type annotations to a codebase is pretty correlated with making it static enough to do meta-level things on like compiling and optimizing it - for instance if you have a variable that changes types as the program runs, at least now you've enumerated its possible types. It's not the only way of doing so, but it seems to work well in practice and there seems to be a correlation between compiled vs. interpreted languages and static vs. dynamic typing.
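      To make that concrete, here is a tiny, hypothetical example of what such a "well-typed" function might look like (the function name and types are made up for illustration): complete annotations enumerate every type a value can take, which mypy can check and a static compiler could, in principle, exploit.

```python
# Hypothetical example of "well-typed" Python: the annotations
# enumerate every type a value can take, which mypy can verify and a
# static compiler could, in principle, exploit.
from typing import Union

def parse_port(value: Union[str, int]) -> int:
    # The union is spelled out up front: 'value' is only ever a str
    # or an int, never an arbitrary dynamic object.
    if isinstance(value, str):
        return int(value)
    return value

assert parse_port("8080") == 8080
assert parse_port(8080) == 8080
```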

  • Correct. Every time you say yes, you're saying no to something else. It's important to realize what you're saying no to before you say yes.

  • It's funny when developers themselves think effort is so fungible. As if spending 1 hour on A meant you could instead have made 1 hour of worthwhile progress on B, C, or D. Your post takes this to the point of fallacy.

    I would think developers have the experience to realize this isn't true but I see it all the time on these forums.

    • I think I'm making the opposite claim - effort isn't fungible (and availability of effort isn't fungible). You can't necessarily spend 1 hour that would otherwise go into, say, rewriting Mercurial into a compiled language and instead spend it on making CPython faster and get the same results. One of these is more likely to work, and also the two problems are going to attract interest from different people.

      And one of the things that affects how productive one hour of work will be - and also whether random volunteers will even show up with one hour of work - is the likelihood of getting a change accepted and shipped to users. This is influenced by both the maintainers' fundamental openness to that sort of change, and any standards (influenced by the maintainers, who are in turn influenced by their users) about how careful a change must be to not make the project worse on other interesting standards of evaluation. It's also influenced by the number of people working on the project (network effects) because a more vibrant project is more likely to review your code promptly, finish a release, and get it into the hands of more users.

      So I'm claiming that it's better to spend time on rewriting Mercurial in Rust than to spend time on getting CPython startup faster, because the Mercurial folks are actively interested in such contributions and the CPython folks are actively uninterested, and because there are fewer external constraints in making Mercurial startup faster than in making CPython startup faster. And I'm saying that the more we encourage folks to help with rewriting Mercurial in Rust, the more likely additional folks are to show up and help with the same project, thereby making 1 hour of effort even more productive.

  • > This isn't an attitude of "no" - it's an attitude of "yes" to other things.

    You are literally bringing an attitude of "no" to the question of whether you are bringing an attitude of "no" to the discussion....

    • If those who complain about no-attitudes are insisting that the only acceptable response to anything is "yes", I doubt they'll get far.

FWIW, no one who replied to this email thread said anything even close to "no". Victor Stinner points out that startup time is something that comes up a lot and mentions some recent work in the area [1].

Python is a big ship; it may not be as nimble as a young FOSS project, but it is always improving, and investment in things like startup time pays dividends across a large ecosystem.

[1] https://mail.python.org/pipermail/python-dev/2018-May/153300...

I get the impression that backwards-compatibility does weigh pretty heavily on the Python core developers these days. There are so many Python installations out there doing so much that the default answer to a change has to be "no". The fact that macOS and popular Linux distributions ship with copies of Python is great, but once something is effectively a component of operating systems, boldness is not a viable strategy. Arguably, one of the reasons why the transition to Python 3 has been so drawn out is that every time somebody installs macOS or one of many Linux distributions, a new Python 2 system is born. I've seen .NET Core developers explain that having .NET Framework shipped in Windows put them under massive constraints, and this was one of the motivations for a new runtime.

I'm not denying this phenomenon, but part of it is surely that widely used projects get more conservative because any change risks breaking something for someone somewhere. And the maintainers tend to feel a sense of responsibility to help people deal with these breakages.

I'll bring a slightly different perspective, as someone who's been using Python professionally for over a decade: there is no such thing as just saying "yes" or "no". Every "yes" to one group is at least an implicit "no" to some other group, and vice-versa.

The Python 2/3 transition is a great example of this. Python 2 continued an earlier tradition of saying "yes" to almost everything from one particular group of programmers: people working on Unix who wanted a high-level language they could use to write Unix utilities, administrative tools, daemons, etc. In doing that, Python said "no" to people in a lot of other domains.

Python 3 switched to saying "yes" to those other domains much more often. Which came with the inherent cost of saying "no" (or, more often, "not anymore") to the Unix-y crowd Python 2 had catered to. Life got harder for those programmers with Python 3. There's been work since then to mitigate some of the worst of it, but some of the changes that made Python nice to use for other domains are just always going to be messy for people doing the traditional Unix-type stuff.

Personally, I think it was the right choice, and not just because my own problem domain got some big improvements from Python 3. In order to keep growing, and really even to maintain what it already had, Python had to become more than just a language that was good for traditional Unix-y things. Not changing in that respect would have been a guaranteed dead end.

This doesn't mean it has to feel good to be someone from the traditional Unix programming domain who now feels like the language only ever says "no". But it does mean that it's worth having the perspective that this was how a lot of us felt in that golden age when you think Python said "yes" to everything, because really it was Python saying "yes" to you and "no" to me. And it's worth understanding that what feels like "no" doesn't mean the language is against you; it means the language is trying to balance the competing needs of a very large community.

  • "people working on Unix .... In doing that, Python said "no" to people in a lot of other domains."

    Could you elaborate on this?

    I thought Python was pretty good about supporting non-Unix OSes from early on. It was originally developed on SGI IRIX and MacOS. From the README for version 0.9:

    > There are built-in modules that interface to the operating system and to various window systems: X11, the Mac window system (you need STDWIN for these two), and Silicon Graphics' GL library. It runs on most modern versions of UNIX, on the Mac, and I wouldn't be surprised if it ran on MS-DOS unchanged. I developed it mostly on an SGI IRIS workstation (using IRIX 3.1 and 3.2) and on the Mac, but have tested it also on SunOS (4.1) and BSD 4.3 (tahoe).

    though it looks like there wasn't "painless" DOS support until 1994, with the comment "Many portability fixes should make it painless to build Python on several new platforms, e.g. NeXT, SEQUENT, WATCOM, DOS, and Windows."

    I also thought that PythonWin had very good Windows support quite early on. The 1.5a3 release notes say:

    > - Mark Hammond will release Python 1.5 versions of PythonWin and his other Windows specific code: the win32api extensions, COM/ActiveX support, and the MFC interface.

    > - As always, the Macintosh port will be done by Jack Jansen. He will make a separate announcement for the Mac specific source code and the binary distribution(s) when these are ready.

    • So, take the Python 3 string changes as an example.

      Python 2 scripting on Unix was great! Python just adopted the Unix tradition of pretending everything is ASCII up until it isn't, and then breaking horribly. And then the Linux world said "just use UTF-8 everywhere!" and really meant "just keep assuming things are ASCII, or at least one byte per code point, and break horribly when it isn't!"

      This was great for people writing command-line scripts and utilities. This was a nightmare for people working in domains like web development.

      Python 3 flipped the script: now, the string type is Unicode, and a lot of APIs broke immediately under Python 3 due to the underlying Unix environment being, well, kind of a clusterfuck when it came to locales and character encoding and hidden assumptions about ASCII or one-byte-per-character. Suddenly, all those people who had been using Python 2 -- which mostly worked identically to the way popular Linux distros did -- were using Python 3 and discovering the hell of character encoding that everybody else had been living in, and they complained loudly about it.

      But for growing from a Unix-y scripting language into a general-purpose language, this change was absolutely necessary. Programmers should have to think about character encoding at their input/output boundaries, and in a high-level language should not be thinking of text as a sequence of bytes. But this requires some significant changes to how you write things like command-line utilities.
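      That boundary discipline in miniature (with made-up sample data): decode bytes to text on the way in, work with str, encode on the way out.

```python
# Python 3's boundary discipline in miniature: decode bytes to text at
# the input boundary, work with str, encode back at the output boundary.
raw = b"caf\xc3\xa9"         # bytes as they might arrive from a file or socket
text = raw.decode("utf-8")   # input boundary: bytes -> str
assert text == "caf\u00e9"   # four characters...
assert len(text) == 4
assert len(raw) == 5         # ...but five bytes
out = text.encode("utf-8")   # output boundary: str -> bytes
assert out == raw
```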

      This is an example of a "yes" to one group being a "no" to another group. Or, at least, of it feeling that way.

      Also, saying that Python was a great Unix-y language is not equivalent to "Python only ran on Unix and never supported Windows at all", and you know that, so it was kind of dishonest of you to try to start an argument from the assumption that I said the latter when really I said the former. Don't do that again, please.

      7 replies →

    • The biggest thing is likely the str/unicode change. In py2, if working on a unix system with only ascii, you never had to think about strings. Suddenly with python3, you had to a little bit.

      The gain was that for everyone else (read: web, non-English, anywhere where unicode is common), python became much easier to use. But for those specific english-unix-ascii cases, it was a mild inconvenience.

      Edit: as ubernostrum pointed out, more than a mild inconvenience if you were porting code. If writing new code, it was not much worse, but porting was absolutely a pain.

      2 replies →

  • That's a nice-sounding comment, but... could you be a little more specific about which particular "traditional Unix-y things" Python 3 said "no" to?

    ...I can't really think of many, if any at all. Sometimes you just say "no" to "inertia".

I think part of what explains this attitude in people is "lack of imagination". In the sense that sometimes, especially when an existing project or organization or bureaucracy has become huge and daunting, people cannot imagine excellence anymore, so they believe it to be literally impossible.

To be fair, they are frequently saying no to things other people think they should do (rather than saying no to things like contributions of startup improvements).

"We" is a very abstract term. I am sure that if you proposed a patch that addressed the issue without adverse side effects, it would get accepted.

I think your comment is well-intentioned (I upvoted) but I respectfully disagree. I think wanting Python to be a bit faster is similar to wanting Haskell to have a little bit of mutability. Engineering with restrictions is a good thing, we can do great systems in Haskell because it's a very neat language even though it lacks mutability. We also can do great systems in Python because it's a very neat language even though it's a bit slow. Sure, you can always optimize Python's performance, that's a legitimate problem and it takes a few engineers to solve it. But it's more interesting to work around Python's slowness by engineering tricks such as better algorithms etc.

  • That's not a great analogy. Haskell is a neat language in part because it doesn't have mutability. Python is a neat language despite being slow.

    I can't imagine anyone would object if Python could magically be 10x faster. I can't say the same thing for the Haskell thing.

    • My whole point is that the 10x thing cannot just magically happen. The reason Python is slow is not incompetent programming or lack of magic. We know why it's slow: every namespace in the Python interpreter is a hash map, and pretty much every variable or attribute lookup is a hash map operation. How do you optimize this? The only way is to remove language features like `setattr`, and my whole point is that some people use Python precisely because it's flexible enough to do that, so they need their `setattr`.
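      A quick demonstration of the dynamism in question: instance attributes live in a per-object dict that `setattr` can extend at runtime, while `__slots__` shows the kind of restriction that trades that flexibility for a fixed layout.

```python
# Why attribute access is hard to optimize: instances are backed by a
# dict, and setattr can add attributes at any time, so the interpreter
# cannot assume a fixed object layout.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
assert p.__dict__ == {"x": 1, "y": 2}   # attributes live in a hash map
setattr(p, "z", 3)                      # new attribute added at runtime
assert p.z == 3

# __slots__ trades that flexibility for a fixed layout, the kind of
# restriction the comment above alludes to.
class FixedPoint:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

q = FixedPoint(1, 2)
assert not hasattr(q, "__dict__")       # no per-instance dict at all
try:
    q.z = 3
except AttributeError:
    pass                                # slots forbid new attributes
else:
    raise AssertionError("slots should forbid new attributes")
```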

      4 replies →

  • > Sure, you can always optimize Python's performance, that's a legitimate problem and it takes a few engineers to solve it. But it's more interesting to work around Python's slowness by engineering tricks such as better algorithms etc.

    Surely you're not implying that improving Python's performance would preclude finding interesting algorithms, nor that this is a suitable rationale for keeping Python slow? Anyway, algos can only get you so far when they're built on slow primitives (all data scattered haphazardly across the heap, every property access is a hash table lookup, every function call is a dozen C function calls, etc).

  • > I think wanting Python to be a bit faster is similar to wanting Haskell to have a little bit of mutability

    I'm sorry but that makes zero sense. Haskell is defined by immutability. People want to use haskell because of that characteristic. I don't want to use python because it is slow.

    • Sorry, I disagree. There is definitely a sense in which Haskell is desirable because it is immutable; I myself love immutable data structures, and that certainly makes it desirable. But my point is that it imposes a restriction: now you cannot implement algorithms that need mutability, such as hash maps. It is easy to circumvent such problems, but one other way is basically introducing mutability to Haskell, which totally doesn't make sense. I think the same goes for Python: if you want to make it significantly faster, then you need to face certain trade-offs. Maybe the data model should be optimized, maybe `int` shouldn't be an arbitrary-precision integer, or maybe there should be primitive types like `int` and `double`, as in Java, to increase performance. The truth is these are not Pythonic solutions, and just as mutability is not Haskell-esque, optimizing Python by making these trade-offs is not Pythonic.

      1 reply →

This is why large companies like Google often reinvent the wheel. Open source gives everyone the right to use the code, but not the power of control. Sure, you can fork, but then your version will diverge from the official one, and the pain of maintaining compatibility may be greater than that of writing your own from scratch.

It's a byproduct of how many people you have to answer to. I was having a discussion with a coworker about an app that had a lot of features, which made it seem cluttered but useful. I think small projects can make bolder choices and enable more options because they have a smaller userbase that would be impacted by their changes, and because they want to reach more people, adding a feature is generally a net benefit. But a larger project cannot risk hurting the large userbase it has already established, so it has to be more cautious about the changes it makes.

I've always been disappointed at how quickly people make sweeping generalizations from a single anecdote. (I also think Python can do better here, but the generalization isn't justifiable.)

With major infrastructure like Python there's a tendency to over-emphasise compatibility between releases.

Look at this post in the same list thread: https://mail.python.org/pipermail/python-dev/2018-May/153300...

Python 3.6 is trying an enormous number of potential paths that code for imports might be found at. Why is that fixed in stone? Couldn't Python 3.(n+1) change that, if it's slow and historical, cutting out a bunch of slow system calls?
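A rough sketch of why those lookups add up. The module name and suffix list here are simplified stand-ins (the real import machinery probes more forms per path entry), and since CPython 3.7, `python -X importtime` reports the actual per-module cost.

```python
# Rough illustration of why a single import can trigger many system
# calls: for each sys.path entry the import machinery probes several
# candidate locations. Simplified: 'spam' is a hypothetical module and
# real finders check more suffixes than these three.
import sys

candidates = []
for entry in sys.path:
    for suffix in ("spam/__init__.py", "spam.py", "spam.so"):
        candidates.append(f"{entry}/{suffix}")

# One 'import spam' can mean dozens of filesystem probes.
assert len(candidates) == 3 * len(sys.path)
```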

As someone who makes use of Python to deploy software, it's entirely possible that could cause me a few issues... which I'd fix quite easily. It should be totally reasonable to expect the community using the software to cope with those sorts of changes after a major release; the alternative is ossification.

Django suffered from maintaining too much compatibility and releasing too slowly, and they fixed it. Three or four years ago everyone was talking about moving away from it; now they release often, deprecate stuff when they need to, and the project is as vibrant as it ever was. Time for CPython to learn the same lesson.

Everyone is focusing on python, but where is this "can do" spirit from mozilla? There are languages with better startup times, bash, perl, lua, awk to name a few, that could likely do whatever the python scripts are doing.

  • > but where is this "can do" spirit from mozilla?

    The mail included both

    > At some point, we'll likely replace Python code with Rust so the build system is more "pure" and easier to maintain and reason about.

    and

    > Since I am disproportionately impacted by this issue, if there's anything I can do to help, let me know.

For a large enough project or area of concern, saying "no" to many things is essential to saying "yes" to anything.

Python 3 has the exact opposite problem: too many devs willing to say "yes" to features and a small number of devs who try to keep things fast and maintainable.

Remember that Python2 was faster.

This is true but python's relative slowness (along with the GIL) is an issue that is regularly blown out of all proportion.

Part of the reason for the language's success is because it made intelligent tradeoffs that often went against the grain of the opinions of the commentariat and focused on its strengths rather than pandering to the kinds of people who write language performance comparison blog posts.

If speed were of primary importance then PyPy would be a lot more popular.

  • You're conflating two kinds of "performance", startup latency and steady state throughput. We're talking about the former, and you're proposing improvements for the latter. In fact, moving to pypy is exactly what you shouldn't do to improve startup.

    It's surprising but frequently true that startup latency has a greater effect on the perception of performance than actual throughput. Nobody likes to type a command and then be kept waiting, even if the started program could in principle demonstrate amazing feats of computation once warmed up.

  • The GIL is a pretty nasty problem once you try to scale things beyond one core.

    Simply try something like unpickling a 10 GB data structure while keeping your GUI in the main thread responsive. You cannot do that because the GIL locks up everything while modifying data structures. Move the data to another process instead of another thread. Great, your GUI is responsive but you can't access the data from the main thread.

    You can say that such a humongous data structure is wrong or that a GUI isn't meant to be responsive or programmed in Python or that I'm holding it wrong. Probably right.

    • I've flailed around with this a few times in the last year or so and have found that posting things up and down a multiprocessing.Pipe is the least painful alternative.
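      A minimal sketch of that pattern (the worker payload is a stand-in): heavy work runs in a child process, and only a small result travels back over the Pipe, so the parent thread stays free to service, say, a GUI event loop.

```python
# Minimal sketch of the multiprocessing.Pipe pattern: heavy work
# happens in a child process and only a small result travels back.
from multiprocessing import Process, Pipe

def worker(conn):
    data = list(range(1000))   # stand-in for a huge unpickled structure
    conn.send(len(data))       # post a summary back, not the whole blob
    conn.close()

def fetch_in_subprocess():
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    result = parent_conn.recv()  # parent could poll() instead of blocking
    p.join()
    return result

if __name__ == "__main__":
    assert fetch_in_subprocess() == 1000
```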

      1 reply →

    • "You're holding it wrong" is a poor response to a wide audience, like iPhone users. But it's an OK response to a specialist, like someone tackling the task you describe.

      8 replies →

  • Python derives a good chunk of its speed (if not all of it) from carefully tuned libraries written in other languages (or even for other architectures, in the case of many machine learning packages). As soon as you try to do a lot of heavy processing in Python itself, even the compiled versions quickly bog down. IMO the best way to use Python is as clever glue for highly optimized code. That way you spend the minimum amount of effort and get maximum performance.