Comment by vshulcz

9 hours ago

The line I'd push on is "valid Python that runs unmodified in CPython." True at the syntax level, but the speedup isn't really coming from Nim as a backend, it comes from how much of Python's dynamism you can pin down at compile time. Refcounting semantics, __getattr__, values that change type at runtime, isinstance-based dispatch, monkeypatching in tests: the moment you statically commit to any of those, you've defined a subset, and the subset is the actual product, not the transpiler.

I've spent a fair bit of time generating specialized straight-line code for hot Python paths, killing the per-call attribute and dict lookups the interpreter does. The lesson was that dispatch-bound code claws back most of its overhead without ever leaving CPython. Where AOT-to-native actually pulls ahead is numeric and loop-bound work, where the interpreter loop and boxing dominate. Your 512x288 render is exactly that case, which is why it looks so strong.
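To make the "specialized straight-line code" point concrete, here is a hand-written sketch of that kind of rewrite. It is not the commenter's actual generator; `Vec`, `total_length_generic` and `total_length_specialized` are hypothetical names used only for illustration. The specialized version binds `math.sqrt` once and reads each attribute once per iteration, so the loop body does fewer dictionary and attribute lookups while staying plain CPython.

```python
import math


class Vec:
    """Hypothetical 2D vector used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def total_length_generic(vecs):
    # Each iteration looks up the global `math`, its attribute `sqrt`,
    # and reads v.x and v.y twice each.
    total = 0.0
    for v in vecs:
        total += math.sqrt(v.x * v.x + v.y * v.y)
    return total


def total_length_specialized(vecs, _sqrt=math.sqrt):
    # `_sqrt` is bound once at definition time, so the loop uses a fast
    # local instead of a global plus attribute lookup. Each attribute is
    # read once per iteration and kept in a local variable.
    total = 0.0
    for v in vecs:
        x = v.x
        y = v.y
        total += _sqrt(x * x + y * y)
    return total


vecs = [Vec(3.0, 4.0), Vec(6.0, 8.0)]
print(total_length_generic(vecs), total_length_specialized(vecs))
```

Both functions return the same value; how much the second one saves depends on the CPython version, since recent interpreters already cache many of these lookups.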

So the benchmark I'd want isn't render time, it's what fraction of a real module transpiles with no rewrites. That number tells me whether this is a systems language or a fast path I have to hand-shape around. Codon and Shedskin both hit that wall. Curious where Nimic draws the line.

True, nimic is a statically typed Python subset that drops much of Python's dynamism; the aim is first and foremost to be an efficient systems language. That said, instance-based dispatch in nimic is done via variant types rather than isinstance. And yes, emulating the Nim constructs in Python was the hardest part, while writing the transpiler was straightforward.
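As a rough picture of what "variant types instead of isinstance" can look like in plain Python, here is a minimal sketch of a tagged union. It is an assumption about the general pattern, not nimic's actual syntax or API; `ShapeKind`, `Shape` and `area` are hypothetical names. The point is that the dispatch key is an explicit `kind` field with a fixed set of values, which a compiler can resolve statically, unlike an open-ended isinstance check.

```python
import math
from dataclasses import dataclass
from enum import Enum


class ShapeKind(Enum):
    """Discriminator tag, analogous to the `kind` field of a variant object."""

    CIRCLE = 0
    RECT = 1


@dataclass
class Shape:
    # One record type carries the tag plus the fields of every branch;
    # only the fields matching `kind` are meaningful.
    kind: ShapeKind
    radius: float = 0.0
    width: float = 0.0
    height: float = 0.0


def area(s: Shape) -> float:
    # Dispatch on the tag value, not on the runtime class of the object.
    if s.kind is ShapeKind.CIRCLE:
        return math.pi * s.radius * s.radius
    if s.kind is ShapeKind.RECT:
        return s.width * s.height
    raise ValueError(f"unhandled shape kind: {s.kind}")


print(area(Shape(ShapeKind.RECT, width=2.0, height=3.0)))
```

Adding a new variant means adding a tag and a branch, so the set of cases stays closed and visible in one place.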

Indeed, AOT compilation gives large speed-ups for heavy custom numeric calculations that cannot be easily vectorized in numpy, such as the raytracing logic. When most of the calculations are performed in an external module (written in e.g. C, Rust or Cython), the performance gain may be much smaller, but the added value is that the high-performance module itself can be written in nimic, keeping the whole codebase in Python.

When starting from "pythonic" code, the rewrites for nimic can be substantial, but so would a rewrite in a systems language like C or Rust. Besides letting you optimise the fast path, nimic provides low-level functionality, such as pointers to pointers and bitwise operations, that actually executes within CPython; see, for example, the mp4 muxer implementation in dima-quant/ndsl_raytracer/src/nraytracer/minimp4.py

  • the variant-types-for-dispatch thing is a nice tradeoff, makes sense. honestly the strongest pitch in there is the "write the hot module in nimic instead of dropping to c/rust/cython" angle, which is basically the cython niche. so the question is what nimic adds over cython for that. my guess is dual-mode: a .pyx isn't valid python and won't run under plain cpython, but nimic stays importable and debuggable in cpython and you compile only when you want speed. if that's the edge i'd put it front and center, it's a way sharper sell than "python but fast".

    one thing i'd worry about with "runs unmodified in cpython" though: python ints are arbitrary precision and nim's aren't. so the same code can give you a bignum under cpython and a wraparound under the compiled path. how do you handle that, or is matching cpython semantics explicitly a non-goal for the typed subset?

    • > how do you handle that

      It looks like (please correct me, OP, if I'm wrong) it works the other way around: you use sized int types in Nimic code, and their semantics are emulated on the Python side. See here: https://github.com/dima-quant/nimic/blob/main/src/nimic/ntyp...

      So I'd say that in Nimic Python you get Nim-style integer emulation when running from Python, which keeps both paths consistent with each other but breaks consistency with the rest of Python. Which is OK, I think, given it's explicitly a subset of both languages. It would be possible to make `int` transpile to some BigInt Nim implementation, but you'd need an external dependency for this, as BigInts are not in Nim's stdlib. However, in a speed-focused context, I'm not sure that defaulting to BigInt every time the compiler sees an `: int` annotation would work well. It's a hard decision to make. Curious what the OP's opinion is here.
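For readers who want to see the divergence being discussed, here is a minimal sketch of emulating a 32-bit signed integer in plain Python. It is not nimic's implementation (the linked file is truncated above and is not reproduced here); `wrap_int32` and `add_int32` are hypothetical helpers, and two's-complement wraparound is an assumption taken from the question in this thread. Whether nimic's sized types wrap or raise on overflow is not stated here.

```python
def wrap_int32(value: int) -> int:
    """Reduce an arbitrary-precision Python int to a signed 32-bit value.

    Assumes two's-complement wraparound on overflow.
    """
    value &= 0xFFFFFFFF  # keep the low 32 bits
    if value >= 0x80000000:  # sign bit set: map to the negative range
        value -= 0x100000000
    return value


def add_int32(a: int, b: int) -> int:
    # Do the addition with Python's bignum, then wrap the result.
    return wrap_int32(a + b)


int32_max = 2**31 - 1
print(int32_max + 1)             # plain Python int keeps growing
print(add_int32(int32_max, 1))   # emulated int32 wraps to the minimum value
```

With an emulation along these lines, the CPython run and the compiled run agree with each other, at the cost of no longer matching an unannotated Python `int`.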

Cython is the gold reference standard. There is no wall. You can selectively add type annotations and get optimized hot paths, or not and still get PyObject code that bypasses the interpreter for a free speedup.

Curious why your comment below got flagged. I personally don't see any issue with it.

  • It happens; sometimes it's a misclick, sometimes some automated filter. You can 'vouch' for such comments if you go into that comment's details: https://news.ycombinator.com/item?id=48671561 - in this case, a combination of a new account and maybe posting too quickly could have caused it. I vouched for it, and it should be visible now.