Comment by yjftsjthsd-h
19 hours ago
That sounds like yes? Instead of doing it once at install time, it's done once at first use. It's only once so it's not persistently slower, but that is a perf hit.
My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off if any significant percentage of the files being compiled might not be used ever so the overall cost is lower if you defer to run time.
I think they are making the bet that most modules won't be imported. For example if I install scipy, numpy, Pillow or such: what are the chances that I use a subset of the modules vs literally all of them?
I would bet on a subset for pretty much any non-trivial package (i.e. larger than one or two user facing modules). And for those trivial packages? Well they are usually small, so the cost is small as well. I'm sure there are exceptions: maybe a single gargantuan module thst consists of autogenerated FFI bindings for some C library or such, but that is likely the minority.
> It's only once so it's not persistently slower, but that is a perf hit.
Sure, but you pay that hit either way. Real-world performance is always usage based: the assumption that uv makes is that people run (i.e. import) packages more often than they install them, so amortizing at the point of the import machinery is better for the mean user.
(This assumption is not universal, naturally!)
Ummm, your comment is backwards, right?
Which part? The assumption is that when you `$TOOL install $PACKAGE`, you run (i.e. import) `$PACKAGE` more than you re-install it. So there's no point in slowing down (relatively less common) installation events when you can pay the cost once on import.
(The key part being that 'less common' doesn't mean a non-trivial amount of time.)
3 replies →
Probably for any case where an actual human is doing it. On an image you obviously want to do it at bake time, so I feel default off with a flag would have been a better design decision for pip.
I just read the thread and use Python, I can't comment on the % speedup attributed to uv that comes from this optimization.
Images are a good example where doing it at install-time is probably the best yeah, since every run of the image starts 'fresh', losing the compilation which happened last time the image got started.
If it was a optional toggle it would probably become best practice to activate compilation in dockerfiles.
> On an image you obviously want to do it at bake time
It seems like tons of people are creating container images with an installer tool and having it do a bunch of installations, rather than creating the image with the relevant Python packages already in place. Hard to understand why.
For that matter, a pre-baked Python install could do much more interesting things to improve import times than just leaving a forest of `.pyc` files in `__pycache__` folders all over the place.
You can change it to compile the bytecode on install with a simple environment variable (which you should do when building docker containers if you want to sacrifice some disk space to decrease initial startup time for your app).
you are right. it depends on how often this first start is, if its bad or not..most usecases id guess (total guess, have limited exp with python projects professionally) its not an issue.