The most surprising part of uv's success to me isn't Rust at all, it's how much speed we "unlocked" just by finally treating Python packaging as a well-specified systems problem instead of a pile of historical accidents. If uv had been written in Go or even highly optimized CPython, but with the same design decisions (PEP 517/518/621/658 focus, HTTP range tricks, aggressive wheel-first strategy, ignoring obviously defensive upper bounds, etc.), I strongly suspect we'd be debating a 1.3× vs 1.5× speedup instead of a 10× headline — but the conversation here keeps collapsing back to "Rust rewrite good/bad." That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
It's not just greenfield-ness but the fact that it's a commercial endeavor (even if the code is open-source).
Building a commercial product means you pay money (or something they equally value) to people to do your bidding. You don't have to worry about politics, licensing, and all the usual FOSS-related drama. You pay them to set their opinions aside and build what you want, not what they want (and if that doesn't work, it just means you need to offer more money).
In this case it's a company that believes they can make a "good" package manager they can sell/monetize somehow and so built that "good" package manager. Turns out it's at least good enough that other people now like it too.
This would never work in a FOSS world, because the project would be stuck in endless planning: everyone would have an opinion on how it should be done and nothing would actually get done.
Similar story with systemd - all the bitching you hear about it (to this day!) is the stuff that would've happened during its development phase had it been developed as a typical FOSS project, and would ultimately have made it go nowhere. Instead, it was one guy who just did what he wanted and shared it with the world, and enough other people liked it and started building upon it.
I don't know what you think "typical FOSS projects" are, but in my experience they are exactly like your systemd example: one person who does what they want and shares it with the world. The rest of your argument doesn't really make any sense with that in mind.
> You don't have to worry about politics, licensing, and all the usual FOSS-related drama. You pay them to set their opinions aside and build what you want, not what they want (and if that doesn't work, it just means you need to offer more money).
Money is indeed a great lubricant.
However, it's not black-and-white: "office politics" is a long-standing term for a reason.
Sounds like you’re really down on FOSS and think FOSS projects don’t get stuff done and have no success? You might want to think about that a bit more.
nah, a lot of people working on `uv` have a massive amount of experience working in the rust ecosystem, including on `cargo`, the rust package manager. `uv` is even advertised as `cargo` for python. And what is `cargo`? a FLOSS project.
Lots of lessons from other FLOSS package managers helped `cargo` become great, and then this knowledge helped shape `uv`.
I largely agree but don't want to entirely discount the effect that using a compiled language had.
At least in my limited experience, the selling point with the most traction is that you don't already need a working python install to get UV. And once you have UV, you can just go!
If I had a dollar for every time I've helped somebody untangle the mess of python environments and libraries created by an undocumented mix of python delivered through the distribution's package manager versus native pip versus manually installed...
At least on paper, poetry and UV have a pretty similar feature set. You do, however, need a working python environment to install and use poetry.
> the selling point with the most traction is that you don't already need a working python install to get UV. And once you have UV, you can just go!
I still genuinely do not understand why this is a serious selling point. Linux systems commonly already provide (and heavily depend upon) a Python distribution which is perfectly suitable for creating virtual environments, and Python on Windows is provided by a traditional installer following the usual idioms for Windows end users. (To install uv on Windows I would be expected to use the PowerShell equivalent of a curl | sh trick; many people trying to learn to use Python on Windows have to be taught what cmd.exe is, never mind PowerShell.) If anything, new Python-on-Windows users are getting tripped up by the moving target of attempts to make it even easier (in part because of things Microsoft messed up when trying to coordinate with the CPython team; see for example https://stackoverflow.com/questions/58754860/cmd-opens-windo... when it originally happened in Python 3.7).
> If I had a dollar for every time I've helped somebody untangle the mess of python environment libraries created by an undocumented mix of python delivered through the distributions package management versus native pip versus manually installed...
Sure, but that has everything to do with not understanding (or caring about) virtual environments (which are fundamental, and used by uv under the hood because there is really no viable alternative), and nothing to do with getting Python in the first place. I also don't know what you mean about "native pip" here; it seems like you're conflating the Python installation process with the package installation process.
So basically, it avoids the whole chicken-and-egg problem. With UV you've simply always got "UV -> project Python 1.23 -> project". UV is your dependency manager, and your Python is just another dependency.
With other dependency managers you end up with "system Python 3.45 -> dep manager -> project Python 1.23 -> project". Or worse, "system Python 1.23 -> dep manager -> project Python 1.23 -> project". And of course there will be people who read about the problem and install their own Python manager, so they end up with a "system Python -> virtualenv Python -> poetry Python -> project" stack. Or the other way around, and they'll end up installing their project dependencies globally...
Note that the advantages of Rust are not just execution speed: it's also a good language for expressing one's thoughts, and thus makes it easier to find and unlock the algorithmic speedups that really increase speed.
But yeah. Python packaging has been dumb for decades and successive Python package managers recapitulated the same idiocies over and over. Anyone who had used both Python and a serious programming language knew it; the problem was getting anyone to do anything about it. I can't help thinking that maybe the main reason using Rust worked is that it forced anyone who wanted to contribute to it to experience what using a language with a non-awful package manager is like.
Cargo is not really good. There's the very much non-zero frequency of something with cargo not working for opaque reasons and then suddenly working again after "cargo clean"; the "no, I invoke your binaries" mentality (try running a benchmark without either ^C'ing out of bench to copy the binary name or parsing some internal JSON metadata), because "cargo build" is the only build system in the world which will never tell you what it built; the whole mess with features, default-features, no-default-features; of course the bindgen/sys dependency conflicts ("I'll just use the wrong -L libpath for the bin crate, but if I'm building tests I remember the ...64"); cargo randomly deciding that it now has to rebuild everything, or 50% of everything, for reasons which are never to be known; builds not being reproducible; cargo never cleaning garbage up; and so on.
rustdoc has only slightly changed since the 2010s: it's still very hard to figure out generic/trait-oriented APIs, and it still only does API documentation in mostly the same basic 1:1 "list of items" style. Most projects end up with two totally disjoint sets of documentation, usually one somewhere on GitHub Pages and the rustdoc.
Rust is overall a good language, don't get me wrong. But it and the ecosystem also have a ton of issues (and that's without even mentioning async), and most of these have been sticking around since basically 1.0.
(However, the rules around initialization are just stupid and unsafe is no good. Rust also tends to favor a very allocation-heavy style of writing code, because avoiding allocations tends to be possible but often annoying and difficult in unique-to-rust ways. For somewhat related reasons, trivial things are at times really hard in Rust for no discernible reason. As a concrete, simplistic but also real-world example, Vec::push is an incredibly pessimistic method, but if you want to get around it, you either have to initialize the whole Vec, which is a complete waste of cycles, or you yolo it with reserve+set_len, which is invalid Rust because you didn't properly use MaybeUninit for locations which are only ever written.)
It just has to do with values. If you value perf you aren't going to write it in Python. And if you value perf then everything else becomes a no brainer as well.
It's the same way in JS land. You can make a game in a few kilobytes, but most web pages are still many megabytes for what should have been no JS at all.
> the conversation here keeps collapsing back to "Rust rewrite good/bad." That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
I think there's a few things going on here:
- If you're going to have a project that's obsessed with speed, you might as well use rust/c/c++/zig/etc to develop the project; otherwise you're always going to have python and the python ecosystem as a speed bottleneck. rust/c/c++/zig ecosystems generally care a lot about speed, so you can use a library and know that it's probably going to be fast.
- For example, the entire python ecosystem generally does not put much emphasis on startup time. I know there's been some recent work here on the interpreter itself, but modules in the standard library, like "email", will still pre-compile regular expressions at import time even if they're never used (see the lazy-compilation sketch after this list).
- Because the python ecosystem doesn't generally optimize for speed (especially startup), the slowdowns end up being contagious. If you import a library that doesn't care about startup time, why should your library care about startup time? The same could maybe be said for memory usage.
- The bootstrapping problem is also mostly solved by using a compiled language like c/rust/go. If the package manager is written in python (or even node/javascript), you first have to have python+dependencies installed before you can install python and your dependencies. With uv, you copy/install a single binary file which can then install python + dependencies and automatically do the right thing.
- I think it's possible to write a pretty fast implementation using python, but you'd need to "greenfield" it by rewriting all of the dependencies yourself so you can optimize startup time and bootstrapping.
- Also, as the article mentions, there are _some_ improvements that have happened in the standards/PEPs that should eventually make their way into pip, though it probably won't be quite the gamechanger that uv is.
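To illustrate the import-time regex point above: a minimal sketch (hypothetical pattern and function names) of how a library can defer compilation to first use instead of paying for it at import time:

    import functools
    import re

    @functools.lru_cache(maxsize=None)
    def _header_pattern():
        # Compiled once, on the first call, instead of at module import.
        return re.compile(r"^[A-Za-z-]+:\s*.+$")

    def is_header(line):
        return _header_pattern().match(line) is not None

Modules that do this cost essentially nothing to import if the regex-using code path never runs.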
> the entire python ecosystem generally does not put much emphasis on startup time.
You'd think PyPy would be more popular, then.
> even modules in the standard library will pre-compile regular expressions at import time, even if they're never used, like the "email" module.
Hmm, that is slower than I realized (although still just a fraction of typical module import time):
$ python -m timeit --setup 'import re' 're.compile("foo.*bar"); re.purge()'
10000 loops, best of 5: 26.5 usec per loop
$ python -m timeit --setup 'import sys' 'import re; del sys.modules["re"]'
500 loops, best of 5: 428 usec per loop
I agree the email module is atrocious in general, which specifically matters because it's used by pip for parsing "compiled" metadata (PKG-INFO in sdists, when present, and METADATA in wheels). The format is intended to look like email headers and be parseable that way; but the RFC mandates all kinds of things that are irrelevant to package metadata, and despite the streaming interface it's hard to actually parse only the things you really need to know.
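For illustration, the kind of parsing involved (hypothetical metadata content): METADATA is conventionally read with the stdlib email machinery, roughly like

    from email.parser import HeaderParser

    raw = (
        "Metadata-Version: 2.1\n"
        "Name: demo\n"
        "Version: 1.0\n"
        "Requires-Dist: requests>=2.0\n"
    )
    msg = HeaderParser().parsestr(raw)
    print(msg["Name"], msg.get_all("Requires-Dist"))

which drags in all the RFC machinery even though only a handful of keys matter.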
> Because the python ecosystem doesn't generally optimize for speed (especially startup), the slowdowns end up being contagious. If you import a library that doesn't care about startup time, why should your library care about startup time? The same could maybe be said for memory usage.
I'm trying to fight this, by raising awareness and by choosing my dependencies carefully.
> you first have to have python+dependencies installed before you can install python and your dependencies
It's unusual that you actually need to install Python again after initially having "python+dependencies installed". And pip vendors all its own dependencies except for what's in the standard library. (Which is highly relevant to Debian getting away with the repackaging that it does.)
> I think it's possible to write a pretty fast implementation using python, but you'd need to "greenfield" it by rewriting all of the dependencies yourself so you can optimize startup time and bootstrapping.
This is my current main project btw. (No, I don't really care that uv already exists. I'll have to blog about why.)
> there are _some_ improvements that have happened in the standards/PEPs that should eventually make their way into pip
Most of them already have, along with other changes. The 2025 pip experience is, believe it or not, much better than the ~2018 pip experience, notwithstanding higher expectations for ecosystem complexity.
> That feels like cargo-culting the toolchain [...]
Pun intended?
Jokes aside, what you describe is a common pattern. It's also why Google internally used to get decent speedups from rewriting old C++ projects in Go for a while: the magic was mostly in the rewrite-with-hindsight.
If you put effort into it, you can also get there via an incremental refactoring of an existing system. But the rewrite is probably easier to find motivation for, I guess.
I can't find the quote for this, but I remember Python maintainers wanted package installing and management to be separate things. uv did the opposite, and instead it's more like npm.
Because it broke backwards compatibility? It's worth noting that setuptools is in a similar situation to pip, where any change has a high chance of breaking things (as can be seen by perusing the setuptools and pip bug trackers). PEP 517/518 removed the implementation-defined nature of the ecosystem (which had caused issues for at least a decade, see e.g. the failures of distutils2 and bento), instead replacing it with a system where users complain about which backend to use (which is at least an improvement on the previous situation)...
I suspect that the non-Rust improvements are vastly more important than you're giving credit for. I think a Go version would be 5x or 8x compared to the 10x, maybe closer. It's not that the Rust parts are insignificant, but the algorithmic changes eliminate huge bottlenecks.
> That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
This feels like a very unfair take to me. Uv didn’t happen in isolation, and wasn’t the first alternative to pip. It’s built on a lot of hard work by the community to put the standards in place, through the PEP process, that make it possible.
Poetry largely accomplished the same thing first with most of the speedups (except managing your python installations) and had the disadvantage of starting before the PEPs you mentioned were standardized.
I don't know the problem space and I'm sure that the language-agnostic algorithmic improvements are massive. But to me, there's just something about rust that promotes fast code. It's easy to avoid copies and pointer-chasing, for example. In python, you never have any idea when you're copying, when you're chasing a pointer, when you're allocating, and so on. (Or maybe you do, but I certainly don't.) You're so far from hardware that you start thinking more abstractly and not worrying about performance. For some things, that's probably perfect. But for writing fast code, it's not the right mindset.
The thing is that a lot of the bottlenecks in pip are entirely artificial, and a lot of the rest can't really be improved by rewriting in Rust per se, because they're already written in C (within the Python interpreter itself).
> it's how much speed we "unlocked" just by finally treating Python packaging as a well-specified systems problem instead of a pile of historical accidents.
A lot of that, in turn, boils down to realizing that it could be fast, and then expecting that and caring enough about it.
> but with the same design decisions (PEP 517/518/621/658 focus, HTTP range tricks, aggressive wheel-first strategy, ignoring obviously defensive upper bounds, etc.), I strongly suspect we'd be debating a 1.3× vs 1.5× speedup instead of a 10× headline
I'm doing a project of this sort (although I'm hoping not to reinvent the wheel (heh) for the actual resolution algorithm). I fully expect that some things will be barely improved or even slower, but many things will be nearly as fast as with uv.
For example, installing from cache (the focus for the first round) mainly relies on tools in the standard library that are written in C and have to make system calls and interact with the filesystem; Rust can't do a whole lot to improve on that. On the other hand, a new project can improve by storing unpacked files in the cache (like uv) instead of just the artifact (I'm storing both; pip stores the artifact, but with a msgpack header) and hard-linking them instead of copying them (so that the system calls do less I/O). It can also improve by actually making the cached data accessible without a network call (pip's cache is an HTTP cache; contacting PyPI tells it what the original download URL is for the file it downloaded, which is then hashed to determine its path).
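For the record, the hard-linking part needs nothing exotic. A minimal sketch (hypothetical paths), falling back to a copy where linking isn't possible:

    import os
    import shutil

    def place_file(cache_path, dest_path):
        os.makedirs(os.path.dirname(dest_path), exist_ok=True)
        try:
            # Hard link: the destination shares the cached file's data blocks,
            # so no file contents get copied.
            os.link(cache_path, dest_path)
        except OSError:
            # e.g. cross-filesystem link (EXDEV) or an unsupported filesystem.
            shutil.copy2(cache_path, dest_path)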
For another example, pre-compiling bytecode can be parallelized; there's even already code in the standard library for it. Pip hasn't been taking advantage of that all this time, but to my understanding it will soon feature its own logic (like uv does) to assign files to compile to worker processes. But Rust can't really help with the actual logic being parallelized, because that, too, is written purely in C (at least for CPython), within the interpreter.
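That stdlib support is `compileall`, which has accepted a `workers` argument since Python 3.5; a minimal sketch (hypothetical target path):

    import compileall

    compileall.compile_dir(
        "venv/lib/python3.12/site-packages",  # hypothetical target
        workers=0,  # 0 means use os.cpu_count() worker processes
        quiet=1,    # print errors only
    )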
> why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
(Zeroth, pip has been doing HTTP range tricks, or at least trying, for quite a while. And the exact point of PEP 658 is to obsolete them. It just doesn't really work for sdists with the current level of metadata expressive power, as in other PEPs like 440 and 508. Which is why we have more PEPs in the pipeline trying to fix that, like 725. And discussions and summaries like https://pypackaging-native.github.io/.)
First, you have to write the standards. People in the community expect interoperability. PEP 518 exists specifically so that people could start working on alternatives to Setuptools as a build backend, and PEP 517 exists so that such alternatives could have the option of providing just the build backend functionality. (But the people making things like Poetry and Hatch had grander ideas anyway.)
But also, consider the alternative: the only other viable way would have been for pip to totally rip apart established code paths and possibly break compatibility. And, well, if you used and talked about Python at any point between 2006 and 2020, you should have the first-hand experience required to complete that thought.
That's TensorRT-LLM in its entirety at 1.2.0rc6, locked to run on Ubuntu or NixOS with full MPI and `nvshmem`, the DGX container Jensen's Desk edition (I know because I also rip apart and `autopatchelf` NGC containers for repackaging on Grace/SBSA).
It's... arduous. And the benefit is what, exactly? A very mixed collection of maintainers have asserted that software behavior is monotonic along a single axis, most of which they can't see, and we ran a solver over those guesses?
I think the future is collections of wheels that have been through a process the consumer regards as credible.
I think this post does a really good job of covering how multi-pronged performance is: it certainly doesn't hurt uv to be written in Rust, but it benefits immensely from a decade of thoughtful standardization efforts in Python that lifted the ecosystem away from needing `setup.py` on the hot path for most packages.
Someone once told me that a benefit of staffing a project in Haskell was that it made it easy to select for the type of programmer who went out of their way to become an expert in Haskell.
Tapping the Rust community is a decent reason to do a project in Rust.
It's an interesting debate. The flip side of this coin is getting hires who are more interested in the language or approach than the problem space and tend to either burn out, actively dislike the work at hand, or create problems that don't exist in order to use the language to solve them.
With that said, Rust was a good language for this in my experience. Like any "interesting" thing, there was a moderate bit of language-nerd side quest thrown in, but overall, a good selection metric. I do think it's one of the best "rewrite it in X" languages available today, due to the availability of good developers with rewrite-in-Rust project experience.
The Haskell commentary is curious to me. I've used Haskell professionally but never tried to hire for it. With that said, the other FP-heavy languages that were popular ~2010-2015 were absolutely horrible for this in my experience. I generally subscribe to a vague notion that "skill in a more esoteric programming language will usually indicate a combination of ability to learn/plasticity and interest in the trade," however, using this concept, I had really bad experiences hiring both Scala and Clojure engineers; there was _way_ too much academic interest in language concepts and way too little practical interest in doing work. YMMV :)
Paul Graham said the same thing about Python 20 years ago [1], and back then it was true. But once a programming language hits mainstream, this ceases to be a good filter.
In my experience this is definitely where rust shined. The language wasn't really what made the project succeed so much as having relatively curious, meticulous, detail-oriented people on hand who were interested in solving hard problems.
Sometimes I thought our teams would be a terrible fit for more cookie-cutter applications where rapid development and deployment was the primary objective. We got into the weeds all the time (sometimes because of rust itself), but it happened to be important to do so.
Had we built those projects with JavaScript or Python I suspect the outcomes would have been worse for reasons apart from the language choice.
I think a lot of rust rewrites have this benefit; if you start with hindsight you can do better more easily. Of course, rust is also often beneficial for its own sake, so it's a one-two punch:)
> uv is fast because of what it doesn’t do, not because of what language it’s written in. The standards work of PEP 518, 517, 621, and 658 made fast package management possible. Dropping eggs, pip.conf, and permissive parsing made it achievable. Rust makes it a bit faster still.
Isn't assigning out credit for what made things fast presumptuous without benchmarks? Yes, I imagine a lot is gained by the work of those PEPs. I'm more questioning how much weight is put on the dropping of compatibility compared to the other items. There is also no coverage of decisions influenced by language choice, which likely influences "Optimizations that don't need Rust".
This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
> Isn't assigning out what all made things fast presumptive without benchmarks?
We also have the benchmark of "pip now vs. pip years ago". That has to be controlled for pip version and Python version, but the former hasn't seen a lot of changes that are relevant for most cases, as far as I can tell.
> This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML. A package installer really only needs to look at it at all when building from source, and part of what these standards have done for us is improve wheel coverage. (Other relevant PEPs here include 600 and its predecessors.) Although that has also largely been driven by education within the community, things like e.g. https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... and https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pure-... .
> This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML.
I don't know the details of Python's resolution algorithm, but for Cargo (which is where epage is coming from) a lockfile (which is encoded in TOML) can be somewhat large-ish, maybe pushing 100 kilobytes (to the point where I'm curious if epage has benchmarked to see if lockfile parsing is noticeable in the flamegraph).
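This is easy to eyeball from Python, too; a rough sketch (hypothetical lockfile path; `tomllib` is stdlib since 3.11):

    import time
    import tomllib

    with open("uv.lock", "rb") as f:  # hypothetical lockfile
        data = f.read()

    start = time.perf_counter()
    tomllib.loads(data.decode())
    print(f"parsed {len(data)} bytes in {(time.perf_counter() - start) * 1e3:.1f} ms")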
To be fair, the whole post isn't very good IMO, regardless of ChatGPT involvement, and it's weird how some people seem to treat it like some kind of revelation.
I mean, of course it wasn't specifically Rust that made it fast; it's really a banal statement: you need only very moderate serious programming experience to know that rewriting a legacy system from scratch can make it faster even if you rewrite it in a "slower" language. There have been C++ systems that became faster when rewritten in Python, for god's sake. That's what makes a system a "legacy" system: it does a ton of things and nobody really knows what it does anymore.
But when listing things that made uv faster, it really mentions some silly things, among others. Like, it doesn't parse pip.conf. Right, sure, the secret of uv's speed lies in not parsing other package managers' config files. Great.
So all in all, yes, no doubt that hundreds of little things contributed to making uv faster, but listing a few dozen of them (surely a non-exhaustive list) doesn't really enable you to make any conclusions about the relative importance of different improvements whatsoever. I suppose the mentioned talk[0] (even though it's more than a year old now) would serve as a better technical report.
The content is nice and insightful! But God I wish people stopped using LLMs to 'improve' their prose... Ironically, some day we might employ LLMs to re-humanize texts that have already been massacred.
The author's blog was on HN a few days ago as well, for an article on SBOMs and lockfiles. They've done a lot of work on the supply-chain security side and are clearly knowledgeable, and yet that blog post got similarly "fuzzified" by the LLM.
To me, unless it is egregious, I would be very careful to avoid false positives before saying something is LLM-aided. If it is clearly just slop, then okay, but I definitely think there is going to be a point where people claim well-written, straightforward posts are LLM-aided. (Or even the opposite, which already happens, where people purposely put errors in prose to seem genuine.)
there is going to be a point where people have read so much slop that they will start regurgitating the same style without even realising it. or we could already be at that point
I have reached a point where any AI smell (of which this article has many) makes me want to exit immediately. It feels torturous to my reading sensibilities.
I blame fixed AI system prompts - they forcibly collapse all inputs into the same output space. Truly disappointing that OpenAI et al. have no desire to change this before everything on the internet sounds the same forever.
You're probably right about the latter point, but I do wonder how hard it'd be to mask the default "marketing copywriter" tone of the LLM by asking it to assume some other tone in your prompt.
As you said, reading this stuff is taxing. What's more, this is a daily occurrence by now. If there's a silver lining, it's that the LLM smells are so obvious at the moment; I can close the tab as soon as I notice one.
> Ironically, some day we might employ LLMs to re-humanize texts
I heard high school and college students are doing this routinely so their papers don't get flagged as AI
this is whether they used an LLM for the whole assignment or wrote it themselves; it has to get passed through a "re-humanizing" LLM either way, just to avoid drama
> Zero-copy deserialization. uv uses rkyv to deserialize cached data without copying it. The data format is the in-memory format. This is a Rust-specific technique.
This (zero-copy deserialization) is not a Rust-specific technique, so I'm not entirely sure why the author describes it as one. Any good low-level language (C/C++ included) can do this, in my experience.
I think the framing in the post is that it's specific to Rust, relative to what Python packaging tools are otherwise written in (Python). It's not very easy to do zero-copy deserialization in pure Python, from experience.
(But also, I think Rust can fairly claim that it's made zero-copy deserialization a lot easier and safer.)
I suppose it can fairly claim that now every other library and blog post invokes "zero-copy" this and that, even in the most nonsensical scenarios. It's a technique for when you literally cannot afford the memory bandwidth, because you are trying to saturate a 100Gbps NIC or handling 8K 60Hz video, not for compromising your data serialization scheme's portability for marketing purposes while all applications hit the network first, disk second, and memory bandwidth never.
I can't even imagine what "safety" issue you have in mind. Given that "zero-copy" apparently means "in-memory" (a deserialized version of the data necessarily cannot be the same object as the original data), that's not even difficult to do with the Python standard library. For example, `zipfile.ZipFile` has a convenience method to write to a file, but extracting to in-memory data is as easy as
    import io
    import zipfile

    def read_member(archive_name, file_name):
        with zipfile.ZipFile(archive_name) as a:
            with a.open(file_name) as f, io.BytesIO() as b:
                b.write(f.read())
                return b.getvalue()
(That does, of course, copy data around within memory, but.)
> pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn’t, largely because backwards compatibility with fifteen years of edge cases takes precedence.
pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.
For example, my employer (Datadog) allowed me and two other engineers to improve various aspects of Python packaging for nearly an entire quarter. One of the items was to satisfy a few long-standing pip feature requests. I discovered that the cross-platform resolution feature I considered most important is basically incompatible [1] with the current code base. Maintainers would have to decide which path they prefer.
> pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.
Backwards compatibility is the one thing that prevents the code in an older project from being replaced with a better approach in situ. It cannot be more difficult than a rewrite, except that rewrites (arguably including my project) may hold themselves free to skip hard legacy cases, at least initially (they might not be relevant by the time other code is ready).
(I would be interested in hearing from you about UX designs for cross-platform resolution, though. Are you just imagining passing command-line flags that describe the desired target environment? What's the use case exactly — just making a .pylock file? It's hard to imagine cross-platform installation....)
My favorite speed-up trick: “HTTP range requests for metadata. Wheel files are zip archives, and zip archives put their file listing at the end. uv tries PEP 658 metadata first, falls back to HTTP range requests for the zip central directory, then full wheel download, then building from source. Each step is slower and riskier. The design makes the fast path cover 99% of cases. None of this requires Rust.”
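A minimal sketch of that fallback step in Python, assuming a hypothetical wheel URL and a server that honors Range requests (real tools also handle wheels whose central directory doesn't fit in the fetched tail):

    import io
    import urllib.request
    import zipfile

    url = "https://files.example.org/demo_pkg-1.0-py3-none-any.whl"
    req = urllib.request.Request(url, headers={"Range": "bytes=-65536"})  # last 64 KiB
    tail = urllib.request.urlopen(req).read()

    # zipfile tolerates the missing prefix as long as the end-of-central-directory
    # record and the central directory are inside the fetched bytes, so we can
    # list members without downloading the whole wheel.
    with zipfile.ZipFile(io.BytesIO(tail)) as z:
        print([name for name in z.namelist() if name.endswith("METADATA")])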
> Ignoring requires-python upper bounds. When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive.
Yes, but it's (probably) the least worse thing they can do given how the "PyPI" ecosystem behaves. As PyPI does not allow replacement of artefacts (sdists, wheels, and older formats), and because there is no way to update/correct metadata for the artefacts, unless the uploader knew at upload time of incompatibilities between their package and the upper-bounded reference (whether that is the Python interpreter or a Python package), the upper bound does not reflect a known incompatibility. In addition, certain tools (e.g. poetry) added the upper bounds automatically, increasing the amount of spurious bounds. https://iscinumpy.dev/post/bound-version-constraints/ provides more details.
The general lesson from this is when you do not allow changes/replacement of invalid data (which is a legitimate thing to do), then you get stuck with handling the bad data in every system which uses it (and then you need to worry about different components handling the badness in different ways, see e.g. browsers).
No. When such upper bounds are respected, they contaminate other packages, because you have to add them yourself to be compatible with your dependencies. Then your dependents must add them too, etc. This brings only pain. Python 4 is not even a thing; core developers say there won't ever be a Python 4.
> you have to add them yourself to be compatible with your dependencies
This is no more true for version upper bounds than it is for version lower bounds, assuming that package installers ensure all package version constraints are satisfied.
I presume you think version lower bounds should still be honoured?
> PEP 658 went live on PyPI in May 2023. uv launched in February 2024. The timing isn’t coincidental. uv could be fast because the ecosystem finally had the infrastructure to support it. A tool like uv couldn’t have shipped in 2020. The standards weren’t there yet.
How/why did the package maintainers start using all these improvements? Some of them sound like a bunch of work, and getting a package ecosystem to move is hard. Was there motivation to speed up installs across the ecosystem? If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
> If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
It wasn't working okay for many people, and many others haven't started using pyproject.toml.
For what I consider the most egregious example: Requests is one of the most popular libraries, under the PSF's official umbrella, which uses only Python code and thus doesn't even need to be "built" in a meaningful sense. It has a pyproject.toml file as of the last release. But that file isn't specifying the build setup following PEP 517/518/621 standards. That's supposed to appear in the next minor release, but they've only done patch releases this year and the relevant code is not at the head of the repo, even though it already caused problems for them this year. It's been more than a year and a half since the last minor release.
I should have mentioned one of the main reasons setup.py turns out not okay for people (aside from the general unpleasantness of running code to determine what should be, and mostly is, static metadata): in the legacy approach, Setuptools has to get `import`ed from the `setup.py` code before it can run, but running that code is the way to find out the dependencies. Including build-time dependencies. Specifically Setuptools itself. Good luck if the user's installed version is incompatible with what you've written.
I like the implication that we can have an alternative to uv speed-wise, but I think reliability and understandability are more important in this context (so this comment is a bit off-topic).
What I want from a package manager is that it just works.
That's what I mostly like about uv.
Many of the changes that made speed possible were to reduce the complexity and thus the likelihood of things not working.
What I don't like about uv (or pip or many other package managers) is that the programmer isn't given a clear mental model of what's happening, and thus of how to fix the inevitable problems. Better (pubgrub) error messages are good, but it's rare that they can provide specific fixes. So even if you get 99% speed, you end up with 1% perplexity and diagnostic black boxes.
To me the time that matters most is time to fix problems that arise.
> the programmer isn't given a clear mental model of what's happening and thus how to fix the inevitable problems.
This is a priority for PAPER; it's built on a lower-level API so that programmers can work within a clear mental model, and I will be trying my best to communicate well in error messages.
I remain baffled about these posts getting excited about uv's speed. I'd like to see a real poll, but I personally can't imagine people listing speed as one of their top ten concerns about Python package managers. What are the common use cases where the delay due to package installation is at all material?
At a previous job, I recall updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem. Was not an enjoyable experience.
> updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem
Working heavily in Python for the last 20 years, it absolutely was a big deal. `pip install` has been a significant percentage of the deploy time on pretty much every app I've ever deployed and I've spent countless hours setting up various caching techniques trying to speed it up.
I can run `uvx sometool` without fear because I know that it'll take a few seconds to create a venv, download all the dependencies, and run the tool. uv's speed has literally changed how I work with Python.
`poetry install` on my dayjob’s monolith took about 2 minutes, `uv sync` takes a few seconds. Getting 2 minutes back on every CI job adds up to a lot of time saved
As a multi decade Python user, uv's speed is "life changing". It's a huge devx improvement. We lived with what came before, but now that I have it, I would never want to go back and it's really annoying to work on projects now that aren't using it.
Docker builds are a big one, at least at my company. Any tool that reduces wait time is worth using, and uv is an amazing tool that removes that wait time. I take it you might not use Python much; uv solves almost every pain point, and is fast, which feels rare.
CI: I changed a pipeline at work from pip and pipx to uv, it saved 3 minutes on a 7 minute pipeline. Given how oversubscribed our runners are, anything saving time is a big help.
It is also really nice when working interactively to have snappy tools that don't take you out of the flow more than absolutely necessary. But then I'm quite sensitive to this; I'm one of those people who turn off all GUI animations because they waste my time and make the system feel slow.
It's not just about delays being "material"; waiting on the order of seconds for a venv creation (and knowing that this is because of pip bootstrapping itself, when it should just be able to install cross-environment instead of having to wait until 2022 for an ugly, limited hack to support that) is annoying.
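If you want to see how much of that is pip bootstrapping rather than the venv itself, a quick sketch:

    import tempfile
    import time
    import venv

    for with_pip in (False, True):
        with tempfile.TemporaryDirectory() as env_dir:
            start = time.perf_counter()
            venv.create(env_dir, with_pip=with_pip)  # with_pip=True bootstraps pip
            print(f"with_pip={with_pip}: {time.perf_counter() - start:.2f}s")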
I avoided Python for years, especially because of package and environment management. Python is now my go to for projects since discovering uv, PEP 723 metadata, and LLMs’ ability to write Python.
The speed is nice, but I switched because uv supports "pip compile" from pip-tools, and it is better at resolving dependencies. Also pip-tools uses (used?) internal pip methods and breaks frequently because of that, uv doesn't.
Speed is one of the main reasons why I keep recommending uv to people I work with, and why I initially adopted it: Setting up a venv and installing requirements became so much faster. Replacing pipx and `uv run` for single-file scripts with external dependencies, were additional reasons. With nox adding uv support, it also became much easier and much faster to test across multiple versions of Python
Setting up a new dev instance took 2+ hours with pip at my work. Switching to uv dropped the Python portion down to <1 minute, and the overall setup to 20 minutes.
A similar, but less drastic speedup applied to docker images.
One weird case where this mattered to me: I wanted pip to backtrack to find compatible versions of a set of deps, and it wasn't done after waiting a whole hour. uv did the same thing in 5 minutes. This might be kinda common because of how many Python repos out there don't have pinned versions in requirements.txt.
There's an interesting psychology at play here as well, if you are a programmer that chooses a "fast language" it's indicative of your priorities already, it's often not much the language, but that the programmer has decided to optimize for performance from the get go.
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install. You can opt in if you want it.
Are we losing out on performance of the actual installed thing, then? (I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?)
No, because Python itself will generate bytecode for packages once you actually import them. uv just defers that to first-import time, but the cost is amortized in any setting where imports are performed over multiple executions.
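You can watch the deferred step happen; a small sketch with a hypothetical throwaway module (assumes a writable working directory and that bytecode writing isn't disabled):

    import importlib
    import pathlib
    import sys

    pathlib.Path("demo_mod.py").write_text("X = 42\n")
    sys.path.insert(0, ".")
    importlib.import_module("demo_mod")  # first import writes the .pyc
    print(list(pathlib.Path("__pycache__").glob("demo_mod.*.pyc")))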
That sounds like yes? Instead of doing it once at install time, it's done once at first use. It's only once so it's not persistently slower, but that is a perf hit.
My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off if any significant percentage of the files being compiled might not be used ever so the overall cost is lower if you defer to run time.
My Docker build generating the byte code saves it to the image, sharing the cost at build time across all image deployments — whereas, building at first execution means that each deployed image instance has to generate its own bytecode!
That’s a massive amplification, on the order of 10-100x.
“Well just tell it to generate bytecode!”
Sure — but when is the default supposed to be better?
Because this sounds like a massive footgun for a system where requests >> deploys >> builds. That is, every service I’ve written in Python for the last decade.
Yes, uv skipping this step is a one-time significant hit to startup time. E.g. if you're building a Dockerfile I'd recommend setting `--compile-bytecode` / `UV_COMPILE_BYTECODE`
Historically, the practice of producing .pyc files on install started with system-wide installed packages, I believe, since the user running the program might lack privileges to write them.
If the installer can write the .py files, it can also write the .pyc, while the user running them might not be able to in that location.
This optimization hits serverless Python the worst. At Modal we ensure users of uv are setting UV_COMPILE_BYTECODE to avoid the cold start penalty. For large projects .pyc compilation can take hundreds of milliseconds.
> I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?
They do.
> Are we losing out on performance of the actual installed thing, then?
When you consciously precompile Python source files, you can parallelize that process. When you `import` from a `.py` file, you only get that benefit if you somehow coincidentally were already set up for `multiprocessing` and happened to have your workers trying to `import` different files at the same time.
If you have a dependency graph large enough for this to be relevant, it almost certainly includes a large number of files which are never actually imported. At worst the hit to startup time will be equal to the install time saved, and in most cases it'll be a lot smaller.
> a large number of files which are never actually imported
Unfortunately, it typically doesn't work out as well as you might expect, especially given the expectation of putting `import` statements at the top of the file.
> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive.
This is kind of fascinating. I've never considered runtime upper bound requirements. I can think of compelling reasons for lower bounds (dropping version support) or exact runtime version requirements (each version works for exact, specific CPython versions). But now that I think about it, it seems like upper bounds solve a hypothetical problem that you'd never run into in practice.
If PSF announced v4 and declared a set of specific changes, I think this would be reasonable. In the 2/3 era it was definitely reasonable (even necessary). Today though, it doesn't actually save you any trouble.
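The mechanics are easy to play with via the third-party `packaging` library (the PyPA reference implementation of these version specifiers); a sketch of roughly what dropping the upper bound does:

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    spec = SpecifierSet(">=3.8,<4.0")
    print(Version("4.0") in spec)  # False: the defensive upper bound bites

    # Keep only the non-upper-bound clauses, roughly what uv does for requires-python:
    relaxed = SpecifierSet(
        ",".join(str(s) for s in spec if s.operator not in ("<", "<="))
    )
    print(Version("4.0") in relaxed)  # True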
I think the article is being careful not to say uv ignores _all_ upper bound checks, but specifically 4.0 upper bound checks. If a package says it requires python < 3.0, that's still super relevant, and I'd hope for uv to still notice and prevent you from trying to import code that won't work on python 3. Not sure what it actually does.
I read the article as saying it ignores all upper-bounds, and 4.0 is just an example. I could be wrong though - it seems ambiguous to me.
But if we accept that it currently ignores any upper-bounds checks greater than v3, that's interesting. Does that imply that once Python 4 is available, uv will slow down due to needing to actually run those checks?
> This reduces resolver backtracking dramatically since upper bounds are almost always wrong.
I am surprised by this because Python minor versions break backwards compatibility all the time. Our company for example is doing a painful upgrade from py39 to py311
Could you explain what major pain points you've encountered? I can't think of any common breakages cited in 3.10 or 3.11 offhand. 3.12 had a lot more standard library removals, and the `match` statement introduced in 3.10 uses a soft keyword and won't break code that uses `match` as an identifier.
At Plotly we did a decent amount of benchmarking to see how much the different defaults `uv` uses contribute to its performance. This was necessary so we could advise our enterprise customers on the transition. We found you lost almost all of the speed gains if you configured uv to behave as much like pip as you could. A trivial example is the precompile flag, which can easily be 50% of pip's install time for a typical data science venv.
The precompilation thing was brought up to the uv team several months ago IIRC. It doesn't make as much of a difference for uv as for pip, because when uv is told to pre-compile it can parallelize that process. This is easily done in Python (the standard library even provides rudimentary support, which Python's own Makefile uses); it just isn't in pip yet (I understand it will be soon).
This post is excellent. I really like reading deep dives like this that take a complex system like uv and highlight the unique design decisions that make it work so well.
I also appreciate how much credit this gives the many previous years of Python standards processes that enabled it.
I have to say it's just lovely seeing such a nicely crafted and written technical essay. It's so obvious that this is crafted by hand, and reading it just reemphasises how much we've lost because technical bloggers are too ready to hand the keys over to LLMs.
> Every code path you don’t have is a code path you don’t wait for.
No, every code path you don't execute is that. Like
> No .egg support.
How does that explain anything if the egg format is obsolete and not used?
Similar with the spec-strictness fallback logic: it's only slow if the packages you're installing are malformed; otherwise the logic will not run and will not slow you down.
And in general, instead of a list of irrelevant and potentially relevant things, it would be great to understand the actual time savings per item (at least for those that deliver the most speedup)!
But otherwise great and seemingly comprehensive list!
Even in compiled languages, binaries have to get loaded into memory. For Python it's much worse. On my machine:
$ time python -c 'pass'
real 0m0.019s
user 0m0.013s
sys 0m0.006s
$ time pip --version > /dev/null
real 0m0.202s
user 0m0.182s
sys 0m0.021s
Almost all of that extra time is either the module import process or garbage collection at the end. Even with cached bytecode, the former requires finding and reading from literally hundreds of files, deserializing via `marshal.loads` and then running top-level code, which includes creating objects to represent the functions and classes.
It used to be even worse than this; in recent versions, imports related to Requests are deferred to the first time that an HTTPS request is needed.
So... will uv make Python a viable cross-platform utility solution?
I was going to learn Python for just that (file-conversion utilities and the like), but everybody was so down on the messy ecosystem that I never bothered.
Yes, uv basically solves the terrible Python tooling situation.
In my view that was by far the biggest issue with Python - a complete deal-breaker really. But uv solves it pretty well.
The remaining big issues are a) performance, and b) the import system. uv doesn't do anything about those.
Performance may not be an issue in some cases, and the import system is ... tolerable if you're writing "a python project". If you're writing some other project and considering using Python for its scripting system, e.g. to wrangle multiple build systems or whatever, then the import mess is a bigger issue and I would think long and hard before picking it over Deno.
Thanks! I don't really think about importing stuff (which maybe I should), because I assume I'll have to write any specialized logic myself. So... your outlook is encouraging.
I've talked about this many times on HN this year but got beaten to the punch on blogging it seems. Curses.
... Okay, after a brief look, there's still lots of room for me to comment. In particular:
> pip’s slowness isn’t a failure of implementation. For years, Python packaging required executing code to find out what a package needed.
This is largely refuted by the fact that pip is still slow, even when installing from wheels (and getting PEP 658 metadata for them). Pip is actually still slow even when doing nothing. (And when you create a venv and allow pip to be bootstrapped in it, that bootstrap process takes in the high 90s percent of the total time used.)
The article info is great, but why do people put up with LLM tics and slop in their writing? These sentences add no value and treat the reader as stupid.
> This is concurrency, not language magic.
> This is filesystem ops, not language-dependent.
Duh, you literally told me that in the previous sentence and 50 million other times.
This kind of writing goes deeper than LLMs, and reflects a decline in reading ability, patience, and attention. Without passing judgement, there are just more people now who benefit from repetition and summarization embedded directly in the article. The reader isn't 'stupid', just burdened.
Indeed, over the past few weeks I have been coming around to the realization and acceptance that the LLM editorial voice is a benefit to an order of magnitude more HN readers than those (like us) for whom it is ice-pick-in-the-nostril stuff.
Why? They are pretty compatible. Just set the venv in the project's mise.toml and you are good to go. Mise will activate it automatically when you change into the project directory.
If it's really not doing any upper-bound checks, I could see it blowing up under more mundane conditions; Python includes breaking changes on .x releases, so I've had e.g. packages require (say) Python 3.10 when 3.11/12 was current.
As for the 20 ms: if you deal with 20 dependencies one after another, that's 400 ms just to start working.
Shaving half a second on many things makes things fast.
Although, as we saw with zeeek in the other comment, you likely don't need multiprocessing, since the network stack and unzip in the stdlib release the GIL.
Threads are cheaper.
Maybe if you'd bundle pubgrub as a compiled extension, you could get pretty close to uv's perf.
parallel downloads don't need multiprocessing, since this is an IO-bound use case. asyncio or GIL-threads (which unblock on IO) would be perfectly fine. native threads will eventually be the default also.
Indeed, but unzipping while downloading does. Analysing multiple metadata files and exporting lock data as well.
Now, I believe unzip releases the GIL already, so we could already benefit from that, and the rest likely doesn't dominate perf.
But still, rust software is faster on average than python software.
After all, all those things are possible in python, and yet we haven't seen them all in one package manager before uv.
Maybe the strongest advantage of rust, on top of very clean and fast default behaviors, is that it attracts people that care about speed, safety and correctness. And those devs are more likely to spend time implementing fast software.
Though the main benefit of uv is not that it's fast. That's very nice, and opens more use cases, but it's not the killer feature.
The killer feature is that, being a standalone executable, it bypasses all python bootstrapping problems.
Again, that could technically be achieved in python, but friction is a strong force.
> Some of uv’s speed comes from Rust. But not as much as you’d think. Several key optimizations could be implemented in pip today: […] Python-free resolution
I guess you mean doing the things in Python that are supposedly doable from Python.
Yeah, to a zeroth approximation that's my current main project (https://github.com/zahlman/paper). Of course, I'm just some rando with apparently serious issues convincing myself to put in regular unpaid work on it, but I can see in broad strokes how everything is going to work. (I'm not sure I would have thought about, for example, hard-linking files when installing them from cache, without uv existing.)
> Not many projects use setup.py now anyway and pip is still super slow.
Yes, but that's still largely not because of being written in Python. The architecture is really just that bad. Any run of pip that touches the network will end up importing more than 500 modules and a lot of that code will simply not be used.
For example, one of the major dependencies is Rich, which includes things like a 3600-entry mapping of string names to emoji; Rich in turn depends on Pygments which normally includes a bunch of rules for syntax highlighting in dozens of programming languages (but this year they've finished trimming those parts of the vendored Pygments).
Another thing is that pip's cache is an HTTP cache. It literally doesn't know how to access its own package download cache without hitting the network, and it does that access through wrappers that rely on cachecontrol and Requests.
Mine either. Choosing Rust by no means guarantees your tool will be fast—you can of course still screw it up with poor algorithms. But I think most people who choose Rust do so in part because they aspire for their tool to be "blazing fast". Memory safety is a big factor of course, but if you didn't care about performance, you might have gotten that via a GCed (and likely also interpreted or JITed or at least non-LLVM-backend) language.
Yeah sometimes you get surprisingly fast Python programs or surprisingly slow Rust programs, but if you put in a normal amount of effort then in the vast majority of cases Rust is going to be 10-200x faster.
I actually rewrote a non-trivial Python program in Rust once because it was so slow (among other reasons), and got a 50x speedup. It was mostly just running regexes over logs too, which is the sort of thing Python people say is an ideal case (because it's mostly IO or implemented in C).
very nice article, always good to get a review of what a "simple" looking tool does behind the scenes
about rust though
some say a nicer language helps with finding the right architecture (heard that about a cpp veteran dropping it for ocaml: any attempted idea would take weeks in cpp but only a few days in ocaml, so they could explore more)
also the parallelism might be a benefit of the language's orientation
> pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn’t, largely because backwards compatibility with fifteen years of edge cases takes precedence. But it means pip will always be slower than a tool that starts fresh with modern assumptions.
what does backwards compatibility have to do with parallel downloads? or global caching? The metadata-only resolution is the only backwards-compatibility issue in there, and pip can run without a setup.py file being present if pyproject.toml is there.
Short answer: most, or at least a whole lot, of the improvements in uv could be integrated into pip as well (especially parallelizing downloads). But they're not, because there is uv instead, which is also maintained by a for-profit startup. So pip is the loser.
I too use pipenv unless there's a reason not to. I hope people use whatever works best for them.
I feel that sometimes there's a desire on the part of those who use tool X that everyone should use tool X. For some types of technology (car seat belts, antibiotics...) that might be reasonable but otherwise it seems more like a desire for validation of the advocate's own choice.
My biggest complaint with pipenv is/was(?) that its lockfile format only kept the platform identifiers of the platform you locked it on. So if you created it on a Mac, then tried to install from the lockfile on a Linux box, you'd be building from source, because it had only locked in wheels for macOS.
Came here to ask about pipenv. As someone who does not use Python other than for scripting, but also appreciates the reproducibility that pipenv provides, should I be using uv? My understanding is that pipenv is the better successor to venv and pip (combined), but now everyone is talking about uv, so to be honest it's quite confusing.
Edit: to add to that, my understanding is that pipenv is the "standard/approved" method of package management in the Python community, but in practice is it not? Is it now uv?
this shit is ChatGPT-written and I'm really tired of it. If I wanted to read ChatGPT I would have asked it myself. Half of the article is nonsensical repeated buzzwords thrown in for absolutely no reason.
This is great to read because it validates my impression that Python packaging has always been a tremendous overengineered mess. Glad to see someone finally realized you just need a simple standard metadata file per package.
It has been realized in the Python community for a very long time. But there have been years of debate over the contents and formatting, and years of trying to figure out how to convince authors and maintainers to do the necessary work on their end, and years of trying to make sure the ecosystem doesn't explode from trying to remove legacy support.
There are still separate forms of metadata for source packages and pre-compiled distributions. This is necessary because of all the weird idiosyncratic conditional logic that might be necessary in the metadata for platform-specific dependencies. Some projects are reduced to figuring out the final metadata at build time, while building on the user's machine, because that's the only way to find out enough about the user's machine to make everything work.
It really isn't as straightforward as you'd expect, largely because Python code commonly interfaces to compiled code in several different languages, and end users expect this to "just work", including on Windows where they don't have a compiler and might not know what that is.
The most surprising part of uv's success to me isn't Rust at all, it's how much speed we "unlocked" just by finally treating Python packaging as a well-specified systems problem instead of a pile of historical accidents. If uv had been written in Go or even highly optimized CPython, but with the same design decisions (PEP 517/518/621/658 focus, HTTP range tricks, aggressive wheel-first strategy, ignoring obviously defensive upper bounds, etc.), I strongly suspect we'd be debating a 1.3× vs 1.5× speedup instead of a 10× headline — but the conversation here keeps collapsing back to "Rust rewrite good/bad." That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
It's not just greenfield-ness but the fact it's a commercial endeavor (even if the code is open-source).
Building a commercial product means you pay money (or something they equally value) to people to do your bidding. You don't have to worry about politics, licensing, and all the usual FOSS-related drama. You pay them to set their opinions aside and build what you want, not what they want (and if that doesn't work, it just means you need to offer more money).
In this case it's a company that believes they can make a "good" package manager they can sell/monetize somehow and so built that "good" package manager. Turns out it's at least good enough that other people now like it too.
This would never work in a FOSS world because the project will be stuck in endless planning as everyone will have an opinion on how it should be done and nothing will actually get done.
Similar story with systemd - all the bitching you hear about it (to this day!) is the stuff that would've happened during its development phase had it been developed as a typical FOSS project and ultimately made it go nowhere - but instead it's one guy that just did what he wanted and shared it with the world, and enough other people liked it and started building upon it.
I don't know what you think "typical Foss projects" are but in my experience they are exactly like your systemd example: one person that does what they want and share it with the world. The rest of your argument doesn't really make any sense with that in mind.
> You don't have to worry about politics, licensing, and all the usual FOSS-related drama. You pay them to set their opinions aside and build what you want, not what they want (and if that doesn't work, it just means you need to offer more money).
Money is indeed a great lubricator.
However, it's not black-and-white: office politics is a long standing term for a reason.
Sounds like you’re really down on FOSS and think FOSS projects don’t get stuff done and have no success? You might want to think about that a bit more.
it wouldn't work in a foss world because there's like 5 guys doing that shit in their spare time. that said... github...
That doesn't make any sense. You can do open source by yourself and not accept any input.
How's the company behind uv making money?
Is there any sign that Astral is actually making money via uv? How sustainable is it?
I suggest everyone save this comment and review it five years later.
Why doesn't anaconda disprove this?
nah, a lot of people working on `uv` have a massive amount of experience working on the rust ecosystem, including `cargo` the rust package manager. `uv` is even advertised as `cargo` for python. And what is `cargo`? a FLOSS project.
Lots of lessons from other FLOSS package managers helped `cargo` become great, and then this knowledge helped shape `uv`.
I 100% agree with this
And it's true: while I disagree with a lot of systemd decisions, focus has a leveraging effect that's disproportionate.
IIRC, uv was started before Astral (the company working on uv) existed.
numpy would like a word
I largely agree but don't want to entirely discount the effect that using a compiled language had.
At least in my limited experience, the selling point with the most traction is that you don't already need a working python install to get UV. And once you have UV, you can just go!
If I had a dollar for every time I've helped somebody untangle the mess of python environment libraries created by an undocumented mix of python delivered through the distributions package management versus native pip versus manually installed...
At least on paper, both poetry and UV have a pretty similar feature set. You do, however, need a working Python environment to install and use poetry.
> the selling point with the most traction is that you don't already need a working python install to get UV. And once you have UV, you can just go!
I still genuinely do not understand why this is a serious selling point. Linux systems commonly already provide (and heavily depend upon) a Python distribution which is perfectly suitable for creating virtual environments, and Python on Windows is provided by a traditional installer following the usual idioms for Windows end users. (To install uv on Windows I would be expected to use the PowerShell equivalent of a curl | sh trick; many people trying to learn to use Python on Windows have to be taught what cmd.exe is, never mind PowerShell.) If anything, new Python-on-Windows users are getting tripped up by the moving target of attempts to make it even easier (in part because of things Microsoft messed up when trying to coordinate with the CPython team; see for example https://stackoverflow.com/questions/58754860/cmd-opens-windo... when it originally happened in Python 3.7).
> If I had a dollar for every time I've helped somebody untangle the mess of python environment libraries created by an undocumented mix of python delivered through the distributions package management versus native pip versus manually installed...
Sure, but that has everything to do with not understanding (or caring about) virtual environments (which are fundamental, and used by uv under the hood because there is really no viable alternative), and nothing to do with getting Python in the first place. I also don't know what you mean about "native pip" here; it seems like you're conflating the Python installation process with the package installation process.
So basically, it avoids the whole chicken-and-egg problem. With UV you've simply always got "UV -> project Python 1.23 -> project". UV is your dependency manager, and your Python is just another dependency.
With other dependency managers you end up with "system Python 3.45 -> dep manager -> project Python 1.23 -> project". Or worse, "system Python 1.23 -> dep manager -> project Python 1.23 -> project". And of course there will be people who read about the problem and install their own Python manager, so they end up with a "system Python -> virtualenv Python -> poetry Python -> project" stack. Or the other way around, and they'll end up installing their project dependencies globally...
1000% this. uv is trivially installable and is completely unrelated to installations of python.
Note that the advantages of Rust are not just execution speed: it's also a good language for expressing one's thoughts, and thus makes it easier to find and unlock the algorithmic speedups that really increase speed.
But yeah. Python packaging has been dumb for decades and successive Python package managers recapitulated the same idiocies over and over. Anyone who had used both Python and a serious programming language knew it, the problem was getting anyone to do anything about it. I can't help thinking that maybe the main reason using Rust worked is that it forced anyone who wanted to contribute to it to experience what using a language with a non-awful package manager is like.
Cargo is not really good. The very much non-zero frequency of something with cargo not working for opaque reasons and then suddenly working again after "cargo clean", the "no, I invoke your binaries"-mentality (try running a benchmark without either ^C'ing out of bench to copy the binary name or parsing some internal JSON metadata) because "cargo build" is the only build system in the world which will never tell you what it built, the whole mess with features, default-features, no-default-features, of course bindgen/sys dependency conflicts, "I'll just use the wrong -L libpath for the bin crate but if I'm building tests I remember the ...64". cargo randomly deciding that it now has to rebuild everything or 50% of everything for reasons which are never to be known, builds being not reproducible, cargo just never cleaning garbage up and so on.
rustdoc has only slightly changed since the 2010s, it's still very hard to figure out generic/trait-oriented APIs, and it still only does API documentation in mostly the same basic 1:1 "list of items" style. Most projects end up with two totally disjointed sets of documentation, usually one somewhere on github pages and the rustdoc.
Rust is overall a good language, don't get me wrong. But it and the ecosystem also have a ton of issues (and that's without even mentioning async), and most of these have been sticking around since basically 1.0.
(However, the rules around initialization are just stupid and unsafe is no good. Rust also tends to favor a very allocation-heavy style of writing code, because avoiding allocations tends to be possible but often annoying and difficult in unique-to-rust ways. For somewhat related reasons, trivial things are at times really hard in Rust for no discernible reason. As a concrete, simplistic but also real-world example, Vec::push is an incredibly pessimistic method, but if you want to get around it, you either have to initialize the whole Vec, which is a complete waste of cycles, or you yolo it with reserve+set_len, which is invalid Rust because you didn't properly use MaybeUninit for locations which are only ever written.)
It just has to do with values. If you value perf you aren't going to write it in Python. And if you value perf then everything else becomes a no brainer as well.
It's the same way in JS land. You can make a game in a few kilobytes, but most web pages are still many megabytes for what should have been no JS at all.
> the conversation here keeps collapsing back to "Rust rewrite good/bad." That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
I think there's a few things going on here:
- If you're going to have a project that's obsessed with speed, you might as well use rust/c/c++/zig/etc to develop the project; otherwise you're always going to have python and the python ecosystem as a speed bottleneck. rust/c/c++/zig ecosystems generally care a lot about speed, so you can use a library and know that it's probably going to be fast.
- For example, the entire python ecosystem generally does not put much emphasis on startup time. I know there's been some recent work here on the interpreter itself, but even modules in the standard library will pre-compile regular expressions at import time, even if they're never used, like the "email" module.
- Because the python ecosystem doesn't generally optimize for speed (especially startup), the slowdowns end up being contagious. If you import a library that doesn't care about startup time, why should your library care about startup time? The same could maybe be said for memory usage.
- The bootstrapping problem is also mostly solved by using a compiled language like c/rust/go. If the package manager is written in python (or even node/javascript), you first have to have python+dependencies installed before you can install python and your dependencies. With uv, you copy/install a single binary file which can then install python + dependencies and automatically do the right thing.
- I think it's possible to write a pretty fast implementation using python, but you'd need to "greenfield" it by rewriting all of the dependencies yourself so you can optimize startup time and bootstrapping.
- Also, as the article mentions, there are _some_ improvements that have happened in the standards/PEPs that should eventually make their way into pip, though it probably won't be quite the gamechanger that uv is.
> the entire python ecosystem generally does not put much emphasis on startup time.
You'd think PyPy would be more popular, then.
> even modules in the standard library will pre-compile regular expressions at import time, even if they're never used, like the "email" module.
Hmm, that is slower than I realized (although still just a fraction of typical module import time):
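(You can reproduce the measurement with CPython's built-in import profiler; exact numbers will vary by machine:

    python -X importtime -c "import email"

It prints a per-module breakdown of import cost, the regex compilation included.)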
I agree the email module is atrocious in general, which specifically matters because it's used by pip for parsing "compiled" metadata (PKG-INFO in sdists, when present, and METADATA in wheels). The format is intended to look like email headers and be parseable that way; but the RFC mandates all kinds of things that are irrelevant to package metadata, and despite the streaming interface it's hard to actually parse only the things you really need to know.
> Because the python ecosystem doesn't generally optimize for speed (especially startup), the slowdowns end up being contagious. If you import a library that doesn't care about startup time, why should your library care about startup time? The same could maybe be said for memory usage.
I'm trying to fight this, by raising awareness and by choosing my dependencies carefully.
> you first have to have python+dependencies installed before you can install python and your dependencies
It's unusual that you actually need to install Python again after initially having "python+dependencies installed". And pip vendors all its own dependencies except for what's in the standard library. (Which is highly relevant to Debian getting away with the repackaging that it does.)
> I think it's possible to write a pretty fast implementation using python, but you'd need to "greenfield" it by rewriting all of the dependencies yourself so you can optimize startup time and bootstrapping.
This is my current main project btw. (No, I don't really care that uv already exists. I'll have to blog about why.)
> there are _some_ improvements that have happened in the standards/PEPs that should eventually make their way into pip
Most of them already have, along with other changes. The 2025 pip experience is, believe it or not, much better than the ~2018 pip experience, notwithstanding higher expectations for ecosystem complexity.
> That feels like cargo-culting the toolchain [...]
Pun intended?
Jokes aside, what you describe is a common pattern. It's also why Google internally used to get decent speedups from rewriting some old C++ project in Go for a while: the magic was mostly in the rewrite-with-hindsight.
If you put effort into it, you can also get there via an incremental refactoring of an existing system. But the rewrite is probably easier to find motivation for, I guess.
Consensus building and figuring out what was actually needed?
Someone on this site said most tech problems are people problems - this feels like one.
Greenfield mostly solves the problem because it's all new people.
I can't find the quote for this, but I remember Python maintainers wanted package installing and management to be separate things. uv did the opposite, and instead it's more like npm.
Because it broke backwards compatibility? It's worth noting that setuptools is in a similar situation to pip, where any change has a high chance of breaking things (as can be seen by perusing the setuptools and pip bug trackers). PEP 517/518 removed the implementation-defined nature of the ecosystem (which had caused issues for at least a decade, see e.g. the failures of distutils2 and bento), instead replacing it with a system where users complain about which backend to use (which is at least an improvement on the previous situation)...
I suspect that the non-Rust improvements are vastly more important than you're giving credit for. I think the Go version would be 5x or 8x compared to the 10x, maybe even closer. It's not that the Rust parts are insignificant, but the algorithmic changes eliminate huge bottlenecks.
Though Rust probably helps getting the design right, instead of fighting it.
From having sum-types to also having a reasonable packaging system itself.
> That feels like cargo-culting the toolchain instead of asking the uncomfortable question: why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
This feels like a very unfair take to me. Uv didn’t happen in isolation, and wasn’t the first alternative to pip. It’s built on a lot of hard work by the community to put the standards in place, through the PEP process, that make it possible.
What uv did was to bring it all together.
The point stands that it's less about the language than doing said hard work in any reasonable programming language.
Poetry largely accomplished the same thing first with most of the speedups (except managing your python installations) and had the disadvantage of starting before the PEPs you mentioned were standardized.
I don't know the problem space and I'm sure that the language-agnostic algorithmic improvements are massive. But to me, there's just something about rust that promotes fast code. It's easy to avoid copies and pointer-chasing, for example. In python, you never have any idea when you're copying, when you're chasing a pointer, when you're allocating, and so on. (Or maybe you do, but I certainly don't.) You're so far from hardware that you start thinking more abstractly and not worrying about performance. For some things, that's probably perfect. But for writing fast code, it's not the right mindset.
Uv is great, but it seems everyone is still cargo-culting Rust. We still have Poetry and PDM.
The thing is that a lot of the bottlenecks in pip are entirely artificial, and a lot of the rest can't really be improved by rewriting in Rust per se, because they're already written in C (within the Python interpreter itself).
> it's how much speed we "unlocked" just by finally treating Python packaging as a well-specified systems problem instead of a pile of historical accidents.
A lot of that, in turn, boils down to realizing that it could be fast, and then expecting that and caring enough about it.
> but with the same design decisions (PEP 517/518/621/658 focus, HTTP range tricks, aggressive wheel-first strategy, ignoring obviously defensive upper bounds, etc.), I strongly suspect we'd be debating a 1.3× vs 1.5× speedup instead of a 10× headline
I'm doing a project of this sort (although I'm hoping not to reinvent the wheel (heh) for the actual resolution algorithm). I fully expect that some things will be barely improved or even slower, but many things will be nearly as fast as with uv.
For example, installing from cache (the focus for the first round) mainly relies on tools in the standard library that are written in C and have to make system calls and interact with the filesystem; Rust can't do a whole lot to improve on that. On the other hand, a new project can improve by storing unpacked files in the cache (like uv) instead of just the artifact (I'm storing both; pip stores the artifact, but with a msgpack header) and hard-linking them instead of copying them (so that the system calls do less I/O). It can also improve by actually making the cached data accessible without a network call (pip's cache is an HTTP cache; contacting PyPI tells it what the original download URL is for the file it downloaded, which is then hashed to determine its path).
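A minimal sketch of that hard-link-with-copy-fallback idea (paths and the helper are hypothetical; a real installer would also handle existing files, permissions, and record-keeping):

    import os
    import shutil

    def place_file(cache_path: str, site_path: str) -> None:
        """Link a cached, already-unpacked file into site-packages."""
        try:
            # hard link: no file data is copied, both paths share one inode
            os.link(cache_path, site_path)
        except OSError:
            # different filesystem (or no hard-link support): fall back to a copy
            shutil.copy2(cache_path, site_path)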
For another example, pre-compiling bytecode can be parallelized; there's even already code in the standard library for it. Pip hasn't been taking advantage of that all this time, but to my understanding it will soon feature its own logic (like uv does) to assign files to compile to worker processes. But Rust can't really help with the actual logic being parallelized, because that, too, is written purely in C (at least for CPython), within the interpreter.
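The standard-library code in question is compileall, which can already fan the work out across worker processes; a minimal example (the path is hypothetical):

    import compileall

    # workers=0 means "use os.cpu_count() worker processes"
    compileall.compile_dir(
        "venv/lib/python3.12/site-packages",
        workers=0,
        quiet=1,  # only report errors
    )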
> why did it take a greenfield project to give Python the package manager behavior people clearly wanted for the last decade?
(Zeroth, pip has been doing HTTP range tricks, or at least trying, for quite a while. And the exact point of PEP 658 is to obsolete them. It just doesn't really work for sdists with the current level of metadata expressive power, as in other PEPs like 440 and 508. Which is why we have more PEPs in the pipeline trying to fix that, like 725. And discussions and summaries like https://pypackaging-native.github.io/.)
First, you have to write the standards. People in the community expect interoperability. PEP 518 exists specifically so that people could start working on alternatives to Setuptools as a build backend, and PEP 517 exists so that such alternatives could have the option of providing just the build backend functionality. (But the people making things like Poetry and Hatch had grander ideas anyway.)
But also, consider the alternative: the only other viable way would have been for pip to totally rip apart established code paths and possibly break compatibility. And, well, if you used and talked about Python at any point between 2006 and 2020, you should have the first-hand experience required to complete that thought.
Specifically regarding the "aggressive wheel-first strategy", I strongly encourage you to read the discussion on https://github.com/pypa/pip/issues/9140.
I have been a big Astral and uv booster for a long time. But specifications like this one: https://gist.github.com/b7r6/47fea3c139e901cd512e15f42355f26... have me re-evaluating everything.
That's TensorRT-LLM in its entirety at 1.2.0rc6, locked to run on Ubuntu or NixOS with full MPI and `nvshmem`, the DGX container "Jensen's Desk" edition (I know because I also rip apart and `autopatchelf` NGC containers for repackaging on Grace/SBSA).
It's... arduous. And the benefit is what, exactly? A very mixed collection of maintainers have asserted that software behavior is monotonic along a single version axis, most of which they can't see, and we ran a solver over those guesses?
I think the future is collections of wheels that have been through a process the consumer regards as credible.
I think this post does a really good job of covering how multi-pronged performance is: it certainly doesn't hurt uv to be written in Rust, but it benefits immensely from a decade of thoughtful standardization efforts in Python that lifted the ecosystem away from needing `setup.py` on the hot path for most packages.
Someone once told me a benefit of staffing a project in Haskell was that it made it easy to select for the types of programmers that went out of their way to become experts in Haskell.
Tapping the Rust community is a decent reason to do a project in Rust.
It's an interesting debate. The flip side of this coin is getting hires who are more interested in the language or approach than the problem space and tend to either burn out, actively dislike the work at hand, or create problems that don't exist in order to use the language to solve them.
With that said, Rust was a good language for this in my experience. Like any "interesting" thing, there was a moderate bit of language-nerd side quest thrown in, but overall, a good selection metric. I do think it's one of the best Rewrite it in X languages available today due to the availability of good developers with Rewrite in Rust project experience.
The Haskell commentary is curious to me. I've used Haskell professionally but never tried to hire for it. With that said, the other FP-heavy languages that were popular ~2010-2015 were absolutely horrible for this in my experience. I generally subscribe to a vague notion that "skill in a more esoteric programming language will usually indicate a combination of ability to learn/plasticity and interest in the trade," however, using this concept, I had really bad experiences hiring both Scala and Clojure engineers; there was _way_ too much academic interest in language concepts and way too little practical interest in doing work. YMMV :)
Paul Graham said the same thing about Python 20 years ago [1], and back then it was true. But once a programming language hits mainstream, this ceases to be a good filter.
[1] https://paulgraham.com/pypar.html
In my experience this is definitely where rust shined. The language wasn't really what made the project succeed so much as having relatively curious, meticulous, detail-oriented people on hand who were interested in solving hard problems.
Sometimes I thought our teams would be a terrible fit for more cookie-cutter applications where rapid development and deployment was the primary objective. We got into the weeds all the time (sometimes because of rust itself), but it happened to be important to do so.
Had we built those projects with JavaScript or Python I suspect the outcomes would have been worse for reasons apart from the language choice.
I think a lot of rust rewrites have this benefit; if you start with hindsight you can do better more easily. Of course, rust is also often beneficial for its own sake, so it's a one-two punch:)
Succinctly, perhaps with some loss of detail:
"Rewrite" is as important as "Rust".
> I think a lot of rust rewrites have this benefit
I think Rust itself has this benefit
Completely agreed!
Rust rewrites are known for breaking (compatibility with) working software. That's all there is to them.
Got it, so because it is Rust it is good... 10-4!!
> uv is fast because of what it doesn’t do, not because of what language it’s written in. The standards work of PEP 518, 517, 621, and 658 made fast package management possible. Dropping eggs, pip.conf, and permissive parsing made it achievable. Rust makes it a bit faster still.
Isn't assigning out what made things fast presumptuous without benchmarks? Yes, I imagine a lot is gained by the work of those PEPs. I'm more questioning how much weight is put on the dropping of compatibility compared to the other items. There is also no coverage of decisions influenced by language choice, which likely influences "Optimizations that don't need Rust".
This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
> Isn't assigning out what all made things fast presumptive without benchmarks?
We also have the benchmark of "pip now vs. pip years ago". That has to be controlled for pip version and Python version, but the former hasn't seen a lot of changes that are relevant for most cases, as far as I can tell.
> This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed but TOML parse times do show up in benchmarks in Cargo and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML. A package installer really only needs to look at it at all when building from source, and part of what these standards have done for us is improve wheel coverage. (Other relevant PEPs here include 600 and its predecessors.) Although that has also largely been driven by education within the community, things like e.g. https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... and https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pure-... .
> This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML.
I don't know the details of Python's resolution algorithm, but for Cargo (which is where epage is coming from) a lockfile (which is encoded in TOML) can be somewhat large-ish, maybe pushing 100 kilobytes (to the point where I'm curious if epage has benchmarked to see if lockfile parsing is noticeable in the flamegraph).
To be fair, the whole post isn't very good IMO, regardless of ChatGPT involvement, and it's weird how some people seem to treat it like some kind of revelation.
I mean, of course it wasn't specifically Rust that made it fast, it's really a banal statement: you need only very moderate serious programming experience to know, that rewriting legacy system from scratch can make it faster even if you rewrite it in a "slower" language. There have been C++ systems that became faster when rewritten in Python, for god's sake. That's what makes system a "legacy" system: it does a ton of things and nobody really knows what it does anymore.
But when listing things that made uv faster, it really mentions some silly things, among others. Like, it doesn't parse pip.conf. Right, sure, the secret of uv's speed lies in not parsing another package manager's config files. Great.
So all in all, yes, no doubt hundreds of little things contributed to making uv faster, but listing a few dozen of them (surely a non-exhaustive list) doesn't really enable you to make any conclusions about the relative importance of different improvements whatsoever. I suppose the mentioned talk[0] (even though it's more than a year old now) would serve as a better technical report.
[0] https://www.youtube.com/watch?v=gSKTfG1GXYQ
The content is nice and insightful! But God I wish people stopped using LLMs to 'improve' their prose... Ironically, some day we might employ LLMs to re-humanize texts that had been already massacred.
The author's blog was on HN a few days ago as well, for an article on SBOMs and lockfiles. They've done a lot of work on the supply-chain security side and are clearly knowledgeable, and yet the blog post got similarly "fuzzified" by the LLM.
There are a handful of things in TFA that, while not outright false, are sloppy enough that I'd expect someone knowledgeable to know/explain better.
Editing the post to switch five "it's X not Y"s[1] is pretty disappointing. I wish people were more clear with their disclosure of LLM editing.
[1]: https://github.com/andrew/nesbitt.io/commit/0664881a524feac4...
I rescind my previous statement. Also, people have to stop putting everything on GitHub.
This is terrible. So disrespectful. It's baffling how someone can do this under their own name
Interestingly, I didn't catch this; I liked it for not looking LLM-written!
“Why this matters” being the final section is a guaranteed giveaway, among innumerable others.
To me, unless it is egregious, I would be very careful to avoid false positives before saying something is LLM-aided. If it is clearly just slop, then okay, but I definitely think there is going to be a point where people claim well-written, straightforward posts are LLM-aided. (Or even the opposite, which already happens, where people purposely put errors in prose to seem genuine.)
there is going to be a point where people have read so much slop that they will start regurgitating the same style without even realising it. or we could already be at that point
I have reached a point where any AI smell (of which this article has many) makes me want to exit immediately. It feels torturous to my reading sensibilities.
I blame fixed AI system prompts: they forcibly collapse all inputs into the same output space. Truly disappointing that OpenAI et al. have no desire to change this before everything on the internet sounds the same forever.
You're probably right about the latter point, but I do wonder how hard it'd be to mask the default "marketing copywriter" tone of the LLM by asking it to assume some other tone in your prompt.
As you said, reading this stuff is taxing. What's more, this is a daily occurrence by now. If there's a silver lining, it's that the LLM smells are so obvious at the moment; I can close the tab as soon as I notice one.
I also don't read AI slop. It's disrespectful to any reader.
> Ironically, some day we might employ LLMs to re-humanize texts
I heard high school and college students are doing this routinely so their papers don't get flagged as AI
this applies whether they used an LLM for the whole assignment or wrote it themselves; it has to pass through a "re-humanizing" LLM either way just to avoid drama
> Zero-copy deserialization. uv uses rkyv to deserialize cached data without copying it. The data format is the in-memory format. This is a Rust-specific technique.
This (zero-copy deserialization) is not a rust-specific technique, so I'm not entirely sure why the author describes it as one. Any good low level language (C/C++ included) can do this from my experience.
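For what it's worth, even pure Python can express a limited version of the idea with memoryview, so long as the serialized layout matches the in-memory one. A toy sketch (the buffer layout is made up; real formats like rkyv are far more involved):

    import struct

    # made-up cache blob: a little-endian u32 count, then that many u32 values
    blob = b"\x02\x00\x00\x00" b"\x07\x00\x00\x00" b"\x2a\x00\x00\x00"

    buf = memoryview(blob)
    (count,) = struct.unpack_from("<I", buf, 0)   # read in place, no slice copy
    values = buf[4 : 4 + 4 * count].cast("I")     # still a view into blob
    # cast() uses native byte order, so this assumes a little-endian machine
    print(count, list(values))                    # 2 [7, 42]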
Given the context of the article, I think "Rust specific" here means that "it couldn't be done in python".
For example "No interpreter startup" is not specific to Rust either.
I think the framing in the post is that it's specific to Rust, relative to what Python packaging tools are otherwise written in (Python). It's not very easy to do zero-copy deserialization in pure Python, from experience.
(But also, I think Rust can fairly claim that it's made zero-copy deserialization a lot easier and safer.)
I suppose it can fairly claim that now every other library and blog post invokes "zero-copy" this and that, even in the most nonsensical scenarios. It's a technique for when you can literally not afford the memory bandwidth, because you are trying to saturate a 100Gbps NIC or handling 8k 60Hz video, not for compromising your data serialization schemes portability for marketing purposes while all applications hit the network first, disk second and memory bandwidth never.
I can't even imagine what "safety" issue you have in mind. Given that "zero-copy" apparently means "in-memory" (a deserialized version of the data necessarily cannot be the same object as the original data), that's not even difficult to do with the Python standard library. For example, `zipfile.ZipFile` has a convenience method to write to file, but writing to in-memory data is as easy as
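(Something like this; the file names are made up:)

    import zipfile

    with zipfile.ZipFile("example-1.0-py3-none-any.whl") as zf:
        data = zf.read("example/__init__.py")  # member contents as bytes, in memory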
(That does, of course, copy data around within memory, but.)
It's Rust vs Python in this case.
They speak about “technique” but rkyv is a Rust-specific format. Could be an editing error or maybe they’re suggesting it’s more difficult in python.
It seems to me more like a "LLM failing to grasp the true importance of a point" error.
> pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn’t, largely because backwards compatibility with fifteen years of edge cases takes precedence.
pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.
For example, my employer (Datadog) allowed me and two other engineers to improve various aspects of Python packaging for nearly an entire quarter. One of the items was to satisfy a few long-standing pip feature requests. I discovered that the cross-platform resolution feature I considered most important is basically incompatible [1] with the current code base. Maintainers would have to decide which path they prefer.
[1]: https://github.com/pypa/pip/issues/13111
> pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.
Backwards compatibility is the one thing that prevents the code in an older project from being replaced with a better approach in situ. It cannot be more difficult than a rewrite, except that rewrites (arguably including my project) may hold themselves free to skip hard legacy cases, at least initially (they might not be relevant by the time other code is ready).
(I would be interested in hearing from you about UX designs for cross-platform resolution, though. Are you just imagining passing command-line flags that describe the desired target environment? What's the use case exactly — just making a .pylock file? It's hard to imagine cross-platform installation....)
My favorite speed-up trick: “HTTP range requests for metadata. Wheel files are zip archives, and zip archives put their file listing at the end. uv tries PEP 658 metadata first, falls back to HTTP range requests for the zip central directory, then full wheel download, then building from source. Each step is slower and riskier. The design makes the fast path cover 99% of cases. None of this requires Rust.”
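A sketch of that fallback with nothing but the standard library (placeholder URL, no error handling); the last 64 KiB of a wheel normally contains the zip central directory:

    import urllib.request

    url = "https://example.invalid/example-1.0-py3-none-any.whl"  # placeholder

    # ask the server for only the tail of the archive
    req = urllib.request.Request(url, headers={"Range": "bytes=-65536"})
    with urllib.request.urlopen(req) as resp:
        tail = resp.read()

    # the End of Central Directory record is marked by the PK\x05\x06 signature;
    # from it you can read the central directory's offset and size, then issue one
    # more range request to list members and pull just *.dist-info/METADATA
    eocd = tail.rfind(b"PK\x05\x06")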
> None of this requires Rust.
Indeed. As demonstrated by the fact that pip has been doing exactly the same for years.
Part of the reason things are improving is that "tries PEP 658 metadata first" is more likely to succeed, and at some point build tools may have become more aware of how pip expects the zip to be organized (see https://packaging.python.org/en/latest/specifications/binary...), and way more projects ship wheels (because the manylinux standard has improved, and because pure-Python devs have become aware of things like https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pure-...).
> Ignoring requires-python upper bounds. When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive.
Erm, isn't this a bit bad?
Yes, but it's (probably) the least-worst thing they can do, given how the "PyPI" ecosystem behaves. PyPI does not allow replacement of artefacts (sdists, wheels, and older formats), and there is no way to update/correct metadata for the artefacts; so unless the uploader knew at upload time of incompatibilities between their package and the upper-bounded reference (whether that is the Python interpreter or a Python package), the upper bound does not reflect a known incompatibility. In addition, certain tools (e.g. poetry) added the upper bounds automatically, increasing the number of spurious bounds. https://iscinumpy.dev/post/bound-version-constraints/ provides more details.
The general lesson from this is when you do not allow changes/replacement of invalid data (which is a legitimate thing to do), then you get stuck with handling the bad data in every system which uses it (and then you need to worry about different components handling the badness in different ways, see e.g. browsers).
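To make the behavior concrete, here's roughly what "ignore the upper bound" means, sketched with the packaging library (an illustration only, not uv's actual implementation, which is its own Rust code):

    from packaging.specifiers import SpecifierSet

    declared = SpecifierSet(">=3.8,<4.0")

    # drop "<" and "<=" clauses, keep everything else
    relaxed = SpecifierSet(
        ",".join(str(s) for s in declared if s.operator not in ("<", "<="))
    )

    print(relaxed)            # >=3.8
    print("3.13" in relaxed)  # True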
No. When such upper bounds are respected, they contaminate other packages, because you have to add them yourself to be compatible with your dependencies. Then your dependents must add them too, etc. This brings only pain. Python 4 is not even a thing; core developers say there won't ever be a Python 4.
> you have to add them yourself to be compatible with your dependencies
This is no more true for version upper bounds than it is for version lower bounds, assuming that package installers ensure all package version constraints are satisfied.
I presume you think version lower bounds should still be honoured?
> PEP 658 went live on PyPI in May 2023. uv launched in February 2024. The timing isn’t coincidental. uv could be fast because the ecosystem finally had the infrastructure to support it. A tool like uv couldn’t have shipped in 2020. The standards weren’t there yet.
How/why did the package maintainers start using all these improvements? Some of them sound like a bunch of work, and getting a package ecosystem to move is hard. Was there motivation to speed up installs across the ecosystem? If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
> If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
It wasn't working okay for many people, and many others haven't started using pyproject.toml.
For what I consider the most egregious example: Requests is one of the most popular libraries, under the PSF's official umbrella, which uses only Python code and thus doesn't even need to be "built" in a meaningful sense. It has a pyproject.toml file as of the last release. But that file isn't specifying the build setup following PEP 517/518/621 standards. That's supposed to appear in the next minor release, but they've only done patch releases this year and the relevant code is not at the head of the repo, even though it already caused problems for them this year. It's been more than a year and a half since the last minor release.
That's really unfortunate, and it sounds like a quick thing to fix. Is there a pull request with that?
I should have mentioned one of the main reasons setup.py turns out not okay for people (aside from the general unpleasantness of running code to determine what should be, and mostly is, static metadata): in the legacy approach, Setuptools has to get `import`ed from the `setup.py` code before it can run, but running that code is the way to find out the dependencies. Including build-time dependencies. Specifically Setuptools itself. Good luck if the user's installed version is incompatible with what you've written.
Hmm... poetry got me into using pyproject.toml, and with that migrating to uv was surprisingly easy.
Because static declaration was clearly safer and more performant? My question is why pip isn't fully taking advantage
Because pip contains decades of built-up code and lacks the people willing to work on updating it.
I like the implication that we can have an alternative to uv speed-wise, but I think reliability and understandability are more important in this context (so this comment is a bit off-topic).
What I want from a package manager is that it just works.
That's what I mostly like about uv.
Many of the changes that made speed possible were to reduce the complexity and thus the likelihood of things not working.
What I don't like about uv (or pip or many other package managers) is that the programmer isn't given a clear mental model of what's happening, and thus of how to fix the inevitable problems. Better (pubgrub) error messages are good, but it's rare that they can provide specific fixes. So even if you get 99% speed, you end up with 1% perplexity and diagnostic black boxes.
To me the time that matters most is time to fix problems that arise.
> the programmer isn't given a clear mental model of what's happening and thus how to fix the inevitable problems.
This is a priority for PAPER; it's built on a lower-level API so that programmers can work within a clear mental model, and I will be trying my best to communicate well in error messages.
I remain baffled about these posts getting excited about uv’s speed. I’d like to see a real poll but I personally can’t imagine people listing speed as one of the their top ten concerns about python package managers. What are the common use cases where the delay due to package installation is at all material?
Edit to add: I use python daily
At a previous job, I recall updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem. Was not an enjoyable experience.
uv has been a delight to use
> updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem
I'd characterize that as unusable, for sure.
Working heavily in Python for the last 20 years, it absolutely was a big deal. `pip install` has been a significant percentage of the deploy time on pretty much every app I've ever deployed and I've spent countless hours setting up various caching techniques trying to speed it up.
I can run `uvx sometool` without fear because I know that it'll take a few seconds to create a venv, download all the dependencies, and run the tool. uv's speed has literally changed how I work with Python.
I wouldn't say without fear, since you're one typo away from executing a typo-squatted malicious package.
I do use it on CI/CD pipelines, but I wouldn't dare type uvx commands myself on a daily basis.
`poetry install` on my dayjob’s monolith took about 2 minutes, `uv sync` takes a few seconds. Getting 2 minutes back on every CI job adds up to a lot of time saved
As a multi decade Python user, uv's speed is "life changing". It's a huge devx improvement. We lived with what came before, but now that I have it, I would never want to go back and it's really annoying to work on projects now that aren't using it.
Docker builds are a big one, at least at my company. Any tool that reduces wait time is worth using, and uv is an amazing tool that removes that wait time. I take it you might not use python much as it solves almost every pain point, and is fast which feels rare.
CI: I changed a pipeline at work from pip and pipx to uv, it saved 3 minutes on a 7 minute pipeline. Given how oversubscribed our runners are, anything saving time is a big help.
It is also really nice when working interactively to have snappy tools that don't take you out of the flow more than absolutely necessary. But then, I'm quite sensitive to this; I'm one of those people who turn off all GUI animations because they waste my time and make the system feel slow.
It's not just about delays being "material"; waiting on the order of seconds for a venv creation (and knowing that this is because of pip bootstrapping itself, when it should just be able to install cross-environment instead of having to wait until 2022 for an ugly, limited hack to support that) is annoying.
But small efficiencies do matter; see e.g. https://danluu.com/productivity-velocity/.
I avoided Python for years, especially because of package and environment management. Python is now my go to for projects since discovering uv, PEP 723 metadata, and LLMs’ ability to write Python.
The speed is nice, but I switched because uv supports "pip compile" from pip-tools, and it is better at resolving dependencies. Also pip-tools uses (used?) internal pip methods and breaks frequently because of that, uv doesn't.
Speed is one of the main reasons why I keep recommending uv to people I work with, and why I initially adopted it: Setting up a venv and installing requirements became so much faster. Replacing pipx and `uv run` for single-file scripts with external dependencies, were additional reasons. With nox adding uv support, it also became much easier and much faster to test across multiple versions of Python
Setting up a new dev instance took 2+ hours with pip at my work. Switching to uv dropped the Python portion down to <1 minute, and the overall setup to 20 minutes.
A similar, but less drastic speedup applied to docker images.
One weird case where this mattered to me: I wanted pip to backtrack to find compatible versions of a set of deps, and it wasn't done after waiting a whole hour. uv did the same thing in 5 minutes. This might be kinda common because of how many Python repos out there don't have pinned versions in requirements.txt.
for me it's being able to do `uv run whatever` and always know I have the correct dependencies
(also switching python version is so fast)
The biggest benefit is in CI environments and Docker images and the like where all packages can get reinstalled on every run.
Build jobs where you have a lot of dependencies. Those GHA minutes go brrrr.
conda can take an hour to tell you your desired packages are unsatisfiable
that said, other than the solver, most of what uv does is always going to be IO bound
People criticising conda's solver prove they haven't used it in years.
Do you still remain baffled after the many replies confirming that people actually do like their tooling to not be dog slow like pip?
It's annoying. Do you use poetry? Pipenv? It's annoying.
There's an interesting psychology at play here as well, if you are a programmer that chooses a "fast language" it's indicative of your priorities already, it's often not much the language, but that the programmer has decided to optimize for performance from the get go.
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install. You can opt in if you want it.
Are we losing out on performance of the actual installed thing, then? (I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?)
No, because Python itself will generate bytecode for packages once you actually import them. uv just defers that to first-import time, but the cost is amortized in any setting where imports are performed over multiple executions.
That sounds like yes? Instead of doing it once at install time, it's done once at first use. It's only once so it's not persistently slower, but that is a perf hit.
My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off if any significant percentage of the files being compiled might not be used ever so the overall cost is lower if you defer to run time.
That’s actually a negative:
My Docker build generating the byte code saves it to the image, sharing the cost at build time across all image deployments — whereas, building at first execution means that each deployed image instance has to generate its own bytecode!
That’s a massive amplification, on the order of 10-100x.
“Well just tell it to generate bytecode!”
Sure — but when is the default supposed to be better?
Because this sounds like a massive footgun for a system where requests >> deploys >> builds. That is, every service I’ve written in Python for the last decade.
Yes, uv skipping this step is a one time significant hit to start up time. E.g. if you're building a Dockerfile I'd recommend setting `--compile-bytecode` / `UV_COMPILE_BYTECODE`
Historically, the practice of producing .pyc files on install started with system-wide installed packages, I believe, since the user running the program might lack privileges to write them. If the installer can write the .py files it can also write the .pyc, while the user running them might not be able to write to that location.
This optimization hits serverless Python the worst. At Modal we ensure users of uv are setting UV_COMPILE_BYTECODE to avoid the cold start penalty. For large projects .pyc compilation can take hundreds of milliseconds.
> I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?
They do.
> Are we losing out on performance of the actual installed thing, then?
When you consciously precompile Python source files, you can parallelize that process. When you `import` from a `.py` file, you only get that benefit if you somehow coincidentally were already set up for `multiprocessing` and happened to have your workers trying to `import` different files at the same time.
If you have a dependency graph large enough for this to be relevant, it almost certainly includes a large number of files which are never actually imported. At worst the hit to startup time will be equal to the install time saved, and in most cases it'll be a lot smaller.
> a large number of files which are never actually imported
Unfortunately, it typically doesn't work out as well as you might expect, especially given the expectation of putting `import` statements at the top of the file.
> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive.
This is kind of fascinating. I've never considered runtime upper bound requirements. I can think of compelling reasons for lower bounds (dropping version support) or exact runtime version requirements (each version works for exact, specific CPython versions). But now that I think about it, it seems like upper bounds solve a hypothetical problem that you'd never run into in practice.
If PSF announced v4 and declared a set of specific changes, I think this would be reasonable. In the 2/3 era it was definitely reasonable (even necessary). Today though, it doesn't actually save you any trouble.
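Concretely, you can see what the cap does with the `packaging` library (third-party, but the reference implementation of these specifiers); per the article, uv effectively drops the `<4.0` clause:

    from packaging.specifiers import SpecifierSet

    # A typical defensive requires-python declaration
    spec = SpecifierSet(">=3.9,<4.0")
    print("3.12" in spec)  # True
    print("4.0" in spec)   # False, solely because of the defensive cap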
I think the article is being careful not to say uv ignores _all_ upper bound checks, but specifically 4.0 upper bound checks. If a package says it requires python < 3.0, that's still super relevant, and I'd hope for uv to still notice and prevent you from trying to import code that won't work on python 3. Not sure what it actually does.
I read the article as saying it ignores all upper-bounds, and 4.0 is just an example. I could be wrong though - it seems ambiguous to me.
But if we accept that it currently ignores any upper-bound check above v3, that's interesting. Does that imply that once Python 4 is available, uv will slow down due to needing to actually run those checks?
The problem: the specification is binary. Are you compatible or not?
Whether a Python package will be compatible with a version that hasn't been released yet is unanswerable right now.
Having an enum like [compatible, incompatible, untested] would at least fix this.
Amazing how bottlenecked Python's pip was; it was a basic design problem, damn.
> Virtual environments required
This has bothered me more than once when building a base Docker image. Why would I want a venv inside a container running as root?
The old package managers messing up the global state by default is the reason why Docker exists. It's the venv for C.
Because a single docker image can run multiple programs that have mutually exclusive dependencies?
Personally I never want program to ever touch global shared libraries ever. Yuck.
> a single docker image can run multiple programs
You absolutely can. But it's not best practice.
https://docs.docker.com/engine/containers/multi-service_cont...
> This reduces resolver backtracking dramatically since upper bounds are almost always wrong.
I am surprised by this because Python minor versions break backwards compatibility all the time. Our company for example is doing a painful upgrade from py39 to py311
Could you explain what major pain points you've encountered? I can't think of any common breakages cited in 3.10 or 3.11 offhand. 3.12 had a lot more standard library removals, and the `match` statement introduced in 3.10 uses a soft keyword and won't break code that uses `match` as an identifier.
At Plotly we did a decent amount of benchmarking to see how much the different defaults `uv` uses contribute to its performance. This was necessary so we could advise our enterprise customers on the transition. We found you lost almost all of the speed gains if you configured uv to behave as much like pip as you could. A trivial example is the precompile flag, which can easily account for 50% of pip's install time for a typical data science venv.
https://plotly.com/blog/uv-python-package-manager-quirks/
The precompilation thing was brought up to the uv team several months ago IIRC. It doesn't make as much of a difference for uv as for pip, because when uv is told to pre-compile it can parallelize that process. This is easily done in Python (the standard library even provides rudimentary support, which Python's own Makefile uses); it just isn't in pip yet (I understand it will be soon).
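For reference, that standard-library support is `compileall` (a sketch; the path is illustrative):

    import compileall

    # workers=0 means one worker per CPU core; quiet=1 suppresses
    # per-file output. This is the parallelism uv gets by default.
    compileall.compile_dir(
        "venv/lib/python3.12/site-packages", workers=0, quiet=1
    )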
This post is excellent. I really like reading deep dives like this that take a complex system like uv and highlight the unique design decisions that make it work so well.
I also appreciate how much credit this gives the many previous years of Python standards processes that enabled it.
Update: I blogged more about it here, including Python recreations of the HTTP range header trick it uses and the version comparison via u64 integers: https://simonwillison.net/2025/Dec/26/how-uv-got-so-fast/
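The core of the range trick fits in a few lines (a rough sketch, not uv's actual code; the wheel URL is a placeholder):

    import urllib.request

    url = "https://files.pythonhosted.org/.../example-1.0-py3-none-any.whl"

    # Ask only for the final 64 KiB: a zip's central directory sits at
    # the end of the archive, so the member list (and usually the
    # .dist-info/METADATA entry) can be read without downloading the
    # whole wheel.
    req = urllib.request.Request(url, headers={"Range": "bytes=-65536"})
    with urllib.request.urlopen(req) as resp:
        assert resp.status == 206  # Partial Content: range was honored
        tail = resp.read()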
Some of these speed-ups look viable to backport into pip, including parallel downloads, delayed .pyc compilation, dropping egg support, and the version checks.
Not that I'd bother since uv does venv so well. But, "it's not all rust runtime speed" implies pip could be faster too.
I have to say it's just lovely seeing such a nicely crafted and written technical essay. It's so obvious that this is crafted by hand, and reading it just reemphasises how much we've lost because technical bloggers are too ready to hand the keys over to LLMs.
This post was very clearly written with an LLM.
> Zero-copy deserialization
Just a nit on this section: zero-copy deserialization is not Rust-specific (see flatbuffers). rkyv, as a crate for doing it in Rust, is though.
> Every code path you don’t have is a code path you don’t wait for.
No, every code path you don't execute is that. For example:
> No .egg support.
How does that explain anything if the egg format is obsolete and not used?
Similar with spec strictness fallback logic - it's only slow if the packages you're installing are malformed, otherwise the logic will not run and not slow you down.
And in general, instead of a list of irrelevant and potentially relevant things, it would be great to understand the actual time savings per item (at least for those that deliver the most speedup)!
But otherwise great and seemingly comprehensive list!
> No, every code path you don't execute is that.
Even in compiled languages, binaries have to get loaded into memory. For Python it's much worse. On my machine:
Almost all of that extra time is either the module import process or garbage collection at the end. Even with cached bytecode, the former requires finding and reading from literally hundreds of files, deserializing via `marshal.loads` and then running top-level code, which includes creating objects to represent the functions and classes.
It used to be even worse than this; in recent versions, imports related to Requests are deferred to the first time that an HTTPS request is needed.
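You can watch this happen with the interpreter's import profiler (a sketch; works against whatever pip you have installed):

    import subprocess
    import sys

    # -X importtime prints a cumulative per-module import-cost table to
    # stderr; pip's entry point alone drags in hundreds of modules.
    subprocess.run([sys.executable, "-X", "importtime", "-m", "pip", "--version"])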
> binaries have to get loaded into memory.
Unless memory mapped by the OS with no impact on runtime for unused parts?
> imports related to Requests are deferred
Exactly, so again have no impact?
So... will uv make Python a viable cross-platform utility solution?
I was going to learn Python for just that (file-conversion utilities and the like), but everybody was so down on the messy ecosystem that I never bothered.
I write all of my scripts in Python with PEP 723 metadata and run them with `uv run`. Works great on Windows and Linux for me.
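For example, a complete self-contained script (the dependency is just an illustration):

    # /// script
    # requires-python = ">=3.12"
    # dependencies = ["requests"]
    # ///
    import requests

    print(requests.get("https://example.com").status_code)

`uv run script.py` reads the comment block, provisions an ephemeral environment with the dependencies, and runs it.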
It has been viable for a long time, and the kinds of projects you describe are likely well served by the standard library.
Yes, uv basically solves the terrible Python tooling situation.
In my view that was by far the biggest issue with Python - a complete deal-breaker really. But uv solves it pretty well.
The remaining big issues are a) performance, and b) the import system. uv doesn't do anything about those.
Performance may not be an issue in some cases, and the import system is ... tolerable if you're writing "a Python project". If you're writing some other kind of project and considering Python for its scripting system, e.g. to wrangle multiple build systems or whatever, then the import mess is a bigger issue and I would think long and hard before picking it over Deno.
Thanks! I don't really think about importing stuff (which maybe I should), because I assume I'll have to write any specialized logic myself. So... your outlook is encouraging.
I've talked about this many times on HN this year but got beaten to the punch on blogging it seems. Curses.
... Okay, after a brief look, there's still lots of room for me to comment. In particular:
> pip’s slowness isn’t a failure of implementation. For years, Python packaging required executing code to find out what a package needed.
This is largely refuted by the fact that pip is still slow, even when installing from wheels (and getting PEP 658 metadata for them). Pip is actually still slow even when doing nothing. (And when you create a venv and allow pip to be bootstrapped in it, that bootstrap process takes in the high 90s percent of the total time used.)
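That bootstrap cost is easy to measure (a sketch; absolute timings vary by machine):

    import subprocess
    import sys
    import time

    # Compare venv creation with and without the bundled-pip bootstrap.
    for extra, name in ([], "with-pip-venv"), (["--without-pip"], "bare-venv"):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-m", "venv", *extra, name], check=True)
        print(name, f"{time.perf_counter() - start:.2f}s")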
If the uv team has spare time, they should rewrite Python in Rust without any of the legacy baggage.
The article info is great, but why do people put up with LLM tics and slop in their writing? These sentences add no value and treat the reader as stupid.
> This is concurrency, not language magic.
> This is filesystem ops, not language-dependent.
Duh, you literally told me that in the previous sentence and 50 million other times.
This kind of writing goes deeper than LLMs, and reflects a decline in reading ability, patience, and attention. Without passing judgement, there are just more people now who benefit from repetition and summarization embedded directly in the article. The reader isn't 'stupid', just burdened.
Indeed, I have come around in the past few weeks to the realization and acceptance that the LLM editorial voice is a benefit to an order of magnitude more HN readers than those (like us) for whom it is ice-pick-in-the-nostril stuff.
Oh well, all I can do is flag.
It’s fast because it sucks the life force from bad developers to make them into something good.
Jokes aside…
I really like uv but also really like mise and I cannot seem to get them to work well together.
Why? They are pretty compatible. Just set the venv in the project's mise.toml and you are good to go. Mise will activate it automatically when you change into the project directory.
I wish this were enough to get the flake8 devs to accept pyproject support PRs.
Stop using flake8 and use ruff instead. It's made by the same folks that make uv.
Soon uv will deliver results without you even thinking about them beforehand!
> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower.
I will bring popcorn on python 4 release date.
If it's really not doing any upper-bound checks, I could see it blowing up under more mundane conditions; Python includes breaking changes on .x releases, so I've had e.g. packages require (say) Python 3.10 when 3.11/3.12 was current.
I always bring popcorn on major version changes for any programming language. I hope Rust's never 2.0 stance holds.
It would be popcorn-worthy regardless, given the rhetoric surrounding the idea in the community.
I usually don't see the importance of speed in one-time costs... But hey, same discussion with npm, yarn, pnpm...
Other design decisions that made uv fast:
- uncompressing packages while they are still being downloaded, in memory, so that you only have to write to disk once
- design of its own locking format for speed
But yes, rust is actually making it faster because:
- real threads, no need for multi-processing
- no python VM startup overhead
- the dep resolution algo is exactly the type of workload that is faster in a compiled language
Source: this interview with Charlie Marsh: https://www.bitecode.dev/p/charlie-marsh-on-astral-uv-and-th...
The guy has a lot of interesting things to say.
> uncompressing packages while they are still being downloaded
... but the archive directory is at the end of the file?
> no python VM startup overhead
This is about 20 milliseconds on my 11-year-old hardware.
HTTP range strikes again.
As for the 20 ms: if you handle 20 dependencies with one process each, that's 400 ms of interpreter startup before any work gets done.
Shaving half a second off many things makes things fast.
Although, as we saw with zeeek in the other comment, you likely don't need multiprocessing, since the network stack and unzip in the stdlib release the GIL.
Threads are cheaper.
Maybe if you bundled pubgrub as a compiled extension, you could get pretty close to uv's perf.
> real threads, no need for multi-processing
parallel downloads don't need multi-processing since this is an IO bound usecase. asyncio or GIL-threads (which unblock on IO) would be perfectly fine. native threads will eventually be the default also.
Indeed, but unzipping while downloading does. Analysing multiple metadata files and exporting lock data as well.
Now, I believe unzip already releases the GIL, so we could already benefit from that, and the rest likely doesn't dominate performance.
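For instance (a sketch with hypothetical wheel paths), plain threads already parallelize unpacking because zlib drops the GIL while it inflates:

    import threading
    import zipfile

    def unpack(wheel: str, dest: str) -> None:
        # Wheels are zip archives; zlib releases the GIL during
        # decompression, so these threads run concurrently.
        with zipfile.ZipFile(wheel) as zf:
            zf.extractall(dest)

    wheels = ["a.whl", "b.whl", "c.whl"]  # hypothetical downloads
    threads = [
        threading.Thread(target=unpack, args=(w, "site-packages"))
        for w in wheels
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()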
But still, rust software is faster on average than python software.
After all, all those things are possible in python, and yet we haven't seen them all in one package manager before uv.
Maybe the strongest advantage of rust, on top of very clean and fast default behaviors, is that it attracts people that care about speed, safety and correctness. And those devs are more likely to spend time implementing fast software.
Though the main benefit of uv is not that it's fast. That's very nice, and it opens more use cases, but it's not the killer feature.
The killer feature is, being a stand alone executable, it bypasses all python bootstrapping problems.
Again, that could technically be achieved in python, but friction is a strong force.
> Some of uv’s speed comes from Rust. But not as much as you’d think. Several key optimizations could be implemented in pip today: […] Python-free resolution
Umm…
I don't have any real disagreement with any of the details the author said.
But still, I'm skeptical.
If it is doable, the best way to prove it is to actually do it.
If no one implements it, was it ever really doable?
Even if there is no technical reason, perhaps there is a social one?
I guess you mean doing the things in Python that are supposedly doable from Python.
Yeah, to a zeroth approximation that's my current main project (https://github.com/zahlman/paper). Of course, I'm just some rando with apparently serious issues convincing myself to put in regular unpaid work on it, but I can see in broad strokes how everything is going to work. (I'm not sure I would have thought about, for example, hard-linking files when installing them from cache, without uv existing.)
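(The hard-link trick itself is tiny; a sketch with hypothetical paths, not my project's actual layout:)

    import os

    # Link a file out of the wheel cache instead of copying its bytes:
    # same inode, no data copied. os.link fails across filesystems,
    # in which case an installer falls back to copying.
    cached = os.path.expanduser("~/.cache/paper/wheels/example/mod.py")
    target = ".venv/lib/python3.12/site-packages/mod.py"
    os.link(cached, target)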
What are you talking about, this all exists
wait, zero-copy deserialization isn't rust-specific. you can mmap structs in C. done it before, works fine
The point is that it would be difficult in Python, compared to in "system" compiled languages generally.
Mmm I don't buy it. Not many projects use setup.py now anyway and pip is still super slow.
> Plenty of tools are written in Rust without being notably fast.
This also hasn't been my experience. Most tools written in Rust are notably fast.
> Not many projects use setup.py now anyway and pip is still super slow.
Yes, but that's still largely not because of being written in Python. The architecture is really just that bad. Any run of pip that touches the network will end up importing more than 500 modules and a lot of that code will simply not be used.
For example, one of the major dependencies is Rich, which includes things like a 3600-entry mapping of string names to emoji; Rich in turn depends on Pygments which normally includes a bunch of rules for syntax highlighting in dozens of programming languages (but this year they've finished trimming those parts of the vendored Pygments).
Another thing is that pip's cache is an HTTP cache. It literally doesn't know how to access its own package download cache without hitting the network, and it does that access through wrappers that rely on cachecontrol and Requests.
> Any run of pip that touches the network will end up importing more than 500 modules and a lot of that code will simply not be used.
That's a property of Python though. The fact that it isn't compiled (and that importing is very slow).
> a 3600-entry mapping of string names to emoji
Which can easily be zero-cost in Rust.
> It literally doesn't know how to access its own package download cache without hitting the network
This is the only example you've given that actually fits with your thesis.
Mine either. Choosing Rust by no means guarantees your tool will be fast—you can of course still screw it up with poor algorithms. But I think most people who choose Rust do so in part because they aspire for their tool to be "blazing fast". Memory safety is a big factor of course, but if you didn't care about performance, you might have gotten that via a GCed (and likely also interpreted or JITed or at least non-LLVM-backend) language.
Yeah sometimes you get surprisingly fast Python programs or surprisingly slow Rust programs, but if you put in a normal amount of effort then in the vast majority of cases Rust is going to be 10-200x faster.
I actually rewrote a non-trivial Python program in Rust once because it was so slow (among other reasons), and got a 50x speedup. It was mostly just running regexes over logs too, which is the sort of thing Python people say is an ideal case (because it's mostly IO or implemented in C).
Very nice article; always good to get a review of what a "simple"-looking tool does behind the scenes.
About Rust, though: some say a nicer language helps you find the right architecture (I heard that about a C++ veteran who dropped it for OCaml; any attempted idea would take weeks in C++ but only a few days in OCaml, so they could explore more).
Also, the parallelism might be a benefit of the language's orientation.
Enough semi-fanboyism.
> pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn’t, largely because backwards compatibility with fifteen years of edge cases takes precedence. But it means pip will always be slower than a tool that starts fresh with modern assumptions.
What does backwards compatibility have to do with parallel downloads? Or global caching? The metadata-only resolution is the only backwards-compatibility issue in there, and pip can already run without a setup.py file being present if pyproject.toml is there.
Short answer: most, or at least a whole lot, of the improvements in uv could be integrated into pip as well (especially parallelizing downloads). But they're not, because there is uv instead, which is maintained by a for-profit startup. So pip is the loser.
uv seems to be a pet peeve of HN. I always thought pipenv was good but yeah, seems like I was being ignorant
> uv seems to be a pet peeve of HN.
Unless I've been seeing very different submissions than you, "pet peeve" seems like the exact opposite of what is actually the case?
Indeed; I don't think he knows what "peeve" means...
I too use pipenv unless there's a reason not to. I hope people use whatever works best for them.
I feel that sometimes there's a desire on the part of those who use tool X that everyone should use tool X. For some types of technology (car seat belts, antibiotics...) that might be reasonable but otherwise it seems more like a desire for validation of the advocate's own choice.
My biggest complaint with pipenv is/was(?) that its lockfile format only kept the platform identifiers of the platform you locked it on - so if you created it on a Mac, then tried to install from the lockfile on a Linux box, you'd be building from source because it only locked in wheels for macOS.
Poetry and uv avoid this issue.
Came here to ask about pipenv. As someone who does not use Python other than for scripting, but who also appreciates the reproducibility that pipenv provides, should I be using uv? My understanding is that pipenv is the better successor to venv and pip (combined), but now everyone is talking about uv, so to be honest it's quite confusing.
Edit: to add to that, my understanding is that pipenv is the "standard/approved" method of package management in the Python community, but in practice is it not? Is it now uv?
Great post, but the blatant chatgpt-esque feel hits hard… Don’t get me wrong, I love astral! and the content, but…
Reading the other replies here makes it really obvious that this is some LLM’s writing. Maybe even all of it…
> npm’s package.json is declarative
lol
AI slop
TLDR: Because Rust.
This entire AI-generated article uses lots of text just to say the obvious.
That conclusion is largely false, and is not what the article says.
This shit is ChatGPT-written and I'm really tired of it. If I wanted to read ChatGPT I would have asked it myself. Half of the article is nonsensical repeated buzzwords thrown in for absolutely no reason.
This is great to read because it validates my impression that Python packaging has always been a tremendous overengineered mess. Glad to see someone finally realized you just need a simple standard metadata file per package.
It has been realized in the Python community for a very long time. But there have been years of debate over the contents and formatting, and years of trying to figure out how to convince authors and maintainers to do the necessary work on their end, and years of trying to make sure the ecosystem doesn't explode from trying to remove legacy support.
There are still separate forms of metadata for source packages and pre-compiled distributions. This is necessary because of all the weird idiosyncratic conditional logic that might be necessary in the metadata for platform-specific dependencies. Some projects are reduced to figuring out the final metadata at build time, while building on the user's machine, because that's the only way to find out enough about the user's machine to make everything work.
It really isn't as straightforward as you'd expect, largely because Python code commonly interfaces to compiled code in several different languages, and end users expect this to "just work", including on Windows where they don't have a compiler and might not know what that is.
See https://pypackaging-native.github.io/ for the general flavour of it.
Our next trick: getting people to stop writing code (so we can stop writing Python).