Comment by vkazanov

4 days ago

It's used in data science because no other language has this level of library support.

And it got this unprecedented level of support because right from the start it made clear syntax and (perceived) simplicity its focus.

There is also a sort of cumulative effect from being nice for algorithmic work.

Guido's long-term strategy won over numerous other strong candidates for this role.

I think the key thing not obvious to most data scientists is they're not using python because it meets their needs, it's because we've failed them. twice.

1. data scientists aren't programmers, so why do they need a programming language? the tools they should be using don't exist. they'd need programmers to make them, and all we have to offer is... more programming languages.

2. the giant problem at the heart of modern software: the most important feature of a modern programming language is being easy to read and write. this feature is conspicuously absent from most important languages.

they're trapped. they can't do what they need without a programming language but there are only a handful they can possibly use. the real reason python ended up with such good library support is they never really had a choice.

  • When the first scientific libraries were written for python, most alternatives didn't even try to be readable or convenient. The choice was more like C/C++/Fortran vs Python.

    And then Python went into a self-reinforcing loop, with the scientific community coming up with more and more ways to improve Python support for the kind of interactive work that data analysis required. Think ipython -> jupyter -> jupyter forks and other python-centric notebook systems.

    So when data analysis evolved into data science and machine learning, gpu-first library vendors already faced a crowd of people who knew python.

    It is crazy how right now one can use hundreds of gpus through these bits of dirty python wrapped in json.

    • I think you're forgetting perl (plus other unix utils) and matlab. PDL (perl data language) was a thing, as was IDL (and other similar tools).