It's weird, but the pain of moving to a whole new language seems less than the pain of using one of these language subsets. I guess it might be because of "loss aversion": we dislike losing something we have more than we value gaining it. In a new language each newly added feature is a gain, but in a language subset you're always encountering things you would have in the full language. So no matter how good the rational case is, there is a bias against them. I can't think of a successful one, actually - anyone?
The author seems to speak to that here:
> SPy does something different: on one hand, it removes the dynamic features which make Python "slow", but on the other hand it introduces new features which make it possible to implement and use the same pythonic patterns which we like.
The author seems unusually focused on explaining his new way, which may help it feel more like a new language with a coherent philosophy and unique benefits in its own right.
One of the standard architectures for complex applications [1] is to put together a scripting language and a systems programming language. You can either look at the scripting language as primary, with the systems language used for performance or hardware interfacing, or you can look at the systems language as primary, with the scripting used around the edges.
The model of having a closely related scripting and systems language would be an optimization of this, and SPy seems like an answer.
[1] AAA games that have a scripting engine on top of C/C++, a finite element solver that lets you set up a problem with Lua and solve it with FORTRAN, etc.
Yes, there are quite a few real-world cases of this architecture. But wouldn't it be better if the same language (more or less) could be used for both? I don't think such a language exists currently, but I think it would be a nice goal.
> in a language subset you're always encountering things you would have in the full language
Exactly. I once pitched the idea of a Python subset (with a different focus, not performance like SPy), and almost every reaction was "will it support <favourite Python feature / favourite library>".
For example, a new language can build its own solution for array math, or maybe that's not something its users need. OTOH many consider a Python subset to be unacceptable if it doesn't specifically support NumPy.
In the end I came to agree with https://pointersgonewild.com/2024/04/20/the-alternative-impl...:
> positioning your project as an alternative implementation of something is a losing proposition
> don't go trying to create a subset of Python [...] Do your own thing. That way, you can evolve your system at your own pace and in your own direction, without being chained by expectations that your language should have to match the performance, feature set, or library ecosystem of another implementation.
I can see the wisdom in this.
As a counter-example, it feels as though Typescript has managed to (largely?) succeed as a superset of Javascript, so maybe staying chained to an existing language isn't entirely a lost cause -- maybe it's just really, really difficult?
I would say that RPython is successful. However, its primary goal is to be whatever the PyPy developers need to write PyPy.
I totally agree about subsets, though; I would much rather have a superset like Cython. But that has its own challenges for compatibility if you want a "pure Python" version of the same library, which is why I really like that Cython supports type hints now.
Yeah, RPython escapes this, as most people don't write it directly.
MicroPython is another example. No-one expects MicroPython to support NumPy. They're just happy to get away from C. But where you can use full Python, you wouldn't use MicroPython.
Throwing my hands up and moving to Nim was downright easy next to the excessive effort I put into trying out Nuitka, Numba, and PyInstaller for my use case. If you want static compilation, use a language and libraries built with that assumption as a ground rule. The herculean effort of building a half-compatible compiler for a dynamic language seems like a fool's errand, and would be a fun curiosity if so many people hadn't already tried it, especially with Python.
I was looking for someone else who had done this; I had the exact same experience.
That said, anyone looking for a completely statically typed language that has nice ergonomics, is easy to pick up but has enough depth to keep you busy for weeks on end, and is versatile enough to be used for anything, do yourself a favor and give Nim a try.
https://nim-lang.org/
It's so common for something like this to struggle for decades before succeeding.
For instance people have been trying to make a memory safe C for a long time, just recently we got
https://fil-c.org/
which has high compatibility and relatively good performance for that kind of thing. The strength of Python is that so many people are trying things with it.
all of this is well and good if you completely forget that there are billions of lines of Python in prod right now. so your grand epiphany is basically on the level of "let's rewrite it in Rust". i'll let you hold your breath until that rewrite is done (and in the meantime i'll explore workarounds).
Looks very interesting!
I remember chatting with one of the creators of PyPy (not the author of TFA) a number of years ago at HPI. He had just given a talk about how RPython was used in PyPy development, and I was fascinated.
To me, it seemed completely obvious that RPython itself was a really interesting standalone language, but he would have none of it.
Whenever I suggested that RPython might have advantages over PyPy he insisted that PyPy was better and, more strangely, just as fast. Which was sort of puzzling, because the reason given for RPython was speed. When I then suggested that they could (after bootstrap) just use PyPy without the need for RPython, he insisted that PyPy was too slow for that to be feasible.
The fact that both of these statements could not really be true at the same time did not register.
I have asked about using RPython as a generic standalone language before. I think the official statement is that it was never intended to become one; it's really a very minimal subset of Python (so basically no existing Python code will run; it would require heavy refactoring or a complete rewrite), it includes only the features that they currently need, it might be a moving target, and they don't want to give guarantees on the stability of the language, etc.
Once you consider that you need to write a very different kind of code for RPython anyway, maybe just using Nim or some other language is a better idea?
I'm not quite seeing the contradiction either? I sort of get that you're pointing out some kind of tension, but it's not obvious that there's a contradiction. The statements involved don't seem to be interpretable in a self-contained way.
I had understood that the only reason for RPython's existence was that bootstrapping was (or at least seemed) impossible without it... ? Although I didn't dig into that claim, either.
If, in the end, we can just have .spy on some files that contain performance-critical functions while the rest is just normal Python, this could be downright amazing.
We recently swapped out a mypyc-optimised module for a Rust implementation to get a 2-6x speedup; not having to do that would be great.
Antonio Cuni gave a great talk about SPy at EuroPython 2024: https://ep2024.europython.eu/session/spy-static-python-lang-...
Neat idea! The author's ideas about different subsets of Python are worth the price of admission. What you can express in the type system, what performs well under a JIT, what's basically sane and reasonable: these may not be precisely specified, but they are still useful and distinct ideas.
Common Lisp also allows you to redefine everything at runtime but doesn't suffer from the same performance issues that Python has, does it?
Does anyone have insight into this?
Common Lisp doesn't use (expensive) CLOS dispatch in the core language, e.g. to add two numbers or to find the right equality operator. That's one known pain point: because CLOS was "bolted on" rather than part of the language from the start, the divide between internal (using typecase and similar) and external (generic functions) dispatch is pretty ugly; and it gave us the eql/equal/equalp/etc. hell.
Thing is that you need a complex JIT like Julia's or stuff like https://github.com/marcoheisig/fast-generic-functions to offset the cost of constant dynamic dispatch.
I actually had such a conversation on that comparison earlier this year: https://lwn.net/Articles/1032617/
> Thing is that you need a complex JIT like Julia's or stuff like https://github.com/marcoheisig/fast-generic-functions to offset the cost of constant dynamic dispatch.
Julia is always the odd one out, when talking about dynamic vs. static dispatch, because its JIT compiler acts more like an Ahead-of-Time compiler in many regards.
In the best case, types are statically decidable and Julia's compiler just produces a static dispatch and native code like e.g. a C compiler would.
In the worst case, there is a big (or unbounded) number of type candidates.
The grey area in between, where there is a limited number of type candidates, is interesting. As far as I understand, Julia does something similar to the link you provided: based on some heuristics, it will compile instances for a "sealed" number of candidates and fall back to fully dynamic dispatch if there are too many type candidates.
At JuliaCon 2025 there was an interesting talk about this topic: https://m.youtube.com/watch?v=iuq534UDvR4&list=PLP8iPy9hna6S...
For the worst-case scenario, Julia chooses what is, in my view, the nuclear option: if the types are not decidable, it just ships the whole compiler with your code and tries again at runtime. But I guess that's not the only possible solution. Presumably, it would also be possible to fall back to a Julia interpreter for dynamic code. That would be more similar to what JavaScript is doing, just the other way around: instead of interpreting the majority of the code and optimising hot paths with a JIT, our alternative Julia would compile most code statically and use the interpreter for the dynamic parts.
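(To make the "sealed" idea concrete, here is a toy Python sketch of that heuristic. It is not Julia's implementation, and make_specialized is a hypothetical stand-in for generating a type-specialized version: keep fast paths for a bounded set of argument types, otherwise fall back to fully generic dispatch.)

    MAX_SPECIALIZATIONS = 4

    def make_specialized(func, typ):
        # stand-in for "compile a fast version specialized to typ"
        return func

    def sealed(func):
        table = {}
        def dispatch(x):
            impl = table.get(type(x))
            if impl is None:
                if len(table) < MAX_SPECIALIZATIONS:
                    impl = table[type(x)] = make_specialized(func, type(x))
                else:
                    return func(x)  # too many candidates: stay fully dynamic
            return impl(x)
        return dispatch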
> and it gave us the eql/equal/equalp/etc. hell
You don't like those? I've always considered them a fairly elegant deconstruction of the problem domain of equality checking. DWIM languages can get very confusing when they DWIM or don't DWIM.
What is CLOS in this context?
Common Lisp is not a runtime, it's a specification. Implementations are free to compile everything to fast native code, or to interpret everything; the available implementations do that and everything in between. That said, SBCL and the commercial implementations can be extremely fast, especially if you specify types on tight loops. SBCL comes with a disassembler that shows you, right in the REPL, the assembly a function compiles to, so you can even get close to C performance.
Addendum: having a disassembler has been a quite common language primitive in most compiled Lisps since the early days.
Just like Smalltalk and SELF, also Lisp Machines and Interlisp-D.
That usually comes down to the urban myth that Python is special and that there was no other dynamic language before it came to be.
The JIT research on those platforms is what gave us leading JIT capabilities on modern runtimes, OpenJDK HotSpot traces back to Smalltalk and StrongTalk, while V8 traces back to SELF.
Especially in Smalltalk and SELF, you can change anything at any time across the whole image, and have the JIT pick up on that and re-optimize.
Granted, what messes up Python, or better said the CPython implementation, is that C extensions are allowed to mess with its internals, thus voiding many possible optimizations that would otherwise be available.
That's a reason why the JVM, CLR, V8, and ART use handles and have marshaling layers that don't allow such liberties with native extensions.
> Granted, what messes up Python, or better said the CPython implementation, is that C extensions are allowed to mess with its internals, thus voiding many possible optimizations that would otherwise be available.
I'm writing a Ruby compiler (very, very slowly, though faster again now with Claude Code doing most of the heavy lifting), and the same issue with Ruby has made me seriously toy with the idea of one day embedding a C compiler, so that the Ruby compiler can optimise across both languages. It would still have to deal with linking to third-party C code, of course, which is one reason I'm hesitating. But a simple not-so-optimizing C compiler is a trivial toy compared to compiling Ruby, where just the parser makes you want to claw your eyes out, so it would at least widen the surface a bit.
Great explanation. Five years ago I did the genealogical work to discover that StrongTalk begat HotSpot (by virtue of having some of the same authors). It was quite a joy to discover!
The problem with this is that the main value of Python is its ecosystem. SPy aims to be able to import Python libraries, but also not to implement all Python features. If you are not 100% compatible, how can you reliably import libraries?
SPy seems most likely to be appealing as a more Pythonic alternative to Cython rather than as a Python replacement.
hello, author of the blog post and author of SPy here.
> how can you reliably import libraries?
the blog post specifies it, but probably not in a great level of detail. Calling Python libs from SPy will go through libpython.so (so essentially we will embed CPython). So CPython will import the library, and there will be a SPy<=>CPython interop layer to convert/proxy objects between the two worlds.
Thanks for the answer. I have to admit I missed the implications of embedding libpython. Sounds great.
I did a similar project, a typed Perl: cperl. I could import most of the modules, and did add types to some of the important modules. E.g. testing was 2x faster. I needed typing patches for about 10% of CPAN packages.
A type is a contract, not a hint!
> A type is a contract, not a hint!
In Python it is a hint.
While obviously this would need CPython-to-SPy `import` support for it to displace CPython for me, it seems like you can `import` SPy modules from CPython, which makes it an attractive solution for implementing fast libraries where Rust is perhaps too heavy-duty.
Seems correct. See point.spy in the playground here: https://spylang.github.io/spy/
# - the unsafe module allows C-level direct memory access to pointers and unsafe arrays
# - @struct maps directly to C structs
# - most users will never have to deal with this directly: using the `unsafe` module is the equivalent of writing C extensions or using Cython
++spy ethos and ideas
But why limit to an interpreter? Translate to other excellent compiled languages and benefit from the optimization work there.
Giving up on C-API and the dynamic parts of python that 1% of the people use is a good trade-off.
In the age of Cursor and Windsurf, it's not hard to auto-replace incompatible code with something that works in the static-py ecosystem.
Would love to participate in an effort to standardize such a subset.
I'd imagine a lot of the packages you may want to use make deep use of some of these obscure features. So much of Django's magical "it just works" is surely built on various kinds of deep introspection.
Not sure an AI can fix it yet. It's not just adding type annotations.
The position I take is that such obscure code in the guts of a popular package could be slowing down large amounts of deployed code elsewhere. If such code must exist, it should be marked as special (like how Cython does it).
Beyond adding type annotations, there are other important problems to solve when translating Python to Rust (the most popular path in py2many so far).
This is why I've urged FastAPI and pydantic maintainers to give up on BaseModel and use fquery.pydantic/fquery.sqlmodel decorators. They translate much better.
There is a compiler detailed on the page at the link:
> 3. We have a compiler for deployment and performance. The interpreter and the compiler are guaranteed to produce the exact same results at runtime.
Where is it? Would love to compare the approach to py2many.
This seems to be going for a somewhat similar goal to Mojo [0] - anyone here who used both and is willing to offer a comparison?
[0] https://www.modular.com/mojo
Time for me to remind everyone of the Shedskin Python compiler.
https://shedskin.github.io/
Based on my understanding, Mojo aims to make number-crunching computation faster (GPU), whereas SPy aims to make generic Python application logic faster. Very similar, but different sweet spots and use cases.
While GPU is a focus of Mojo, the plan is also to make it a general systems programming language similar to C++ and Rust.
Your understanding of mojo is incomplete. Just visiting their website would have cleared that up.
Reminds me of Shed Skin, but no mention of it. I wonder how they compare.
https://en.wikipedia.org/wiki/Shed_Skin
This feels like the JS -> TS type of change, on the JS side, except that was done with a transpiler.
This is really cool, and what I thought Mojo would be: a subset of Python that is easy to use and read, and that removes the dynamic features we don't use.
Excited to see where this goes.
This looks like a very interesting approach bringing comptime to a static version of python. This version of comptime can then be used to define new types in the same way Zig does it.
I absolutely hate the terminology though: red/blue, redshifting, etc. Why do blue functions disappear when redshifting? If you redshift blue, it goes down in frequency, so you might get green or red. Perhaps my physics brain is just overthinking it!
> The other fundamental concept in SPy is redshifting.
> Each expression is given a color:
> blue expressions are those which can safely be evaluated ahead of time, because they don't have side effects and all operands are statically known.
> red expressions are those which need to be evaluated at runtime.
> During redshifting we eagerly evaluate all the blue parts of the code: it's a form of partial evaluation. This process plays very well with the freezing that we discussed above, because a lot of operations on frozen data become automatically blue: for example, if we statically know the type of an object, the logic to look up a method inside the frozen class hierarchy is a blue operation and it's optimized away, leaving just a direct call as a result.
Please just rename it comptime; then at least people who have learnt Zig will know what it means immediately.
In FORTH these would have been called IMMEDIATE words: functions which run at "compile" time rather than run time.
hello, spy author here.
> Why do blue functions disappear when redshifting? If you red shift blue then it goes down in frequency so you might get green or red. Perhaps my physics brain is just over thinking it!
yes I think you are overthinking :). It's not meant to be accurate physics of course.
The usage of colors to distinguish comptime vs runtime code comes from PyPy: in that context we used "green" and "red", and initial versions of SPy used the same convention.
Then someone pointed out that green/red is not colorblind friendly and so I changed it to blue.
Having actual colors for the two phases is VERY useful for visualization: e.g. we already have a "spy --colorize" command which shows you which parts are blue and which are red.
As for "redshifting": the AST "before" has a mixture of blue and red colors, while the AST "after" has only red nodes, thus the final AST is "more red" than the first one, that's why I chose that name.
I like the idea of a compiled language that takes the look and ethos of Python (or at least the "looks like pseudocode, but runs" ethos).
I don't think the article gives much of an impression on how SPy is on that front.
I believe that Python is as popular and widely used as it is because it's old enough to have an expansive ecosystem of libraries. It's easy enough to implement one in pure Python and possible to optimize it later (Pydantic is a great recent-ish example, switching to a Rust core for 2.0). That same combination of Python + (choose a compiled language) makes it quite difficult for any new language to tap into the main strength of Python.
It's not just its age; it's how easy it is (was?) to jump in and start writing useful code that can be revisited later and still be read and understood.
All of these efforts to turn it into another TypeScript are going to, in the end, kill the ease of use it has always had.
This is what F# provides.
F# has a similar whitespace syntax to Python, but is statically typed and can be compiled AoT.
Bubble sort Python:

    mylist = [64, 34, 25, 12, 22, 11, 90, 5]
    n = len(mylist)
    for i in range(n-1):
        for j in range(n-i-1):
            if mylist[j] > mylist[j+1]:
                mylist[j], mylist[j+1] = mylist[j+1], mylist[j]
    print(mylist)
Bubble sort F#:

    let mylist = ResizeArray [ 64; 34; 25; 12; 22; 11; 90; 5 ]
    let n = Seq.length mylist
    for i = 0 to n - 2 do
        for j = 0 to n - i - 2 do
            if mylist[j] > mylist[j + 1] then
                let temp = mylist[j]
                mylist[j] <- mylist[j + 1]
                mylist[j + 1] <- temp
    printfn "%A" mylist
Nim:

    var mylist = [64, 34, 25, 12, 22, 11, 90, 5]
    let n = mylist.len
    for i in 0..n-2:
      for j in 0..n-i-2:
        if mylist[j] > mylist[j + 1]:
          swap(mylist[j], mylist[j + 1])
    echo mylist
You can have that today with Nim.
Nim feels like a really amazing language. There were some minor things that I wanted to do with it, like trying to solve a Codeforces question out of mere curiosity, to build something on top of it.
I felt that although it is similar to Python, you can't underestimate Python's standard library features, which I felt were lacking in Nim. I am not sure if this was a skill issue. Yes, these are similar languages, but I would still say that I really welcome a language like SPy too.
The funny thing is that I ended up architecting a really complicated solution to a simple problem in Nim, and I was proud of it. Then I asked ChatGPT, thinking there was no way it could find anything simpler in Nim, and it found something that worked in 7-12 lines, and my jaw dropped, lol. Maybe ChatGPT could be decent for learning Nim, or reading some Nim books for sure, but the package environment etc. also felt really brittle.
I think that there are good features of both nim and SPy and I welcome both personally.
There don't seem to be great web frameworks like Flask, Django, or FastAPI for Nim.
If you want different parts of your code to be a statically typed Python lookalike, Cython is a mature option.
Yes, it's mature, but you (and your potential audience) basically need to learn a new language with a lot of quirks and "weird" (I'd even say counter-intuitive) nuances, and it's also significantly less readable than strict, typed Python. Even its modern syntax doesn't click immediately (and performance-wise, the new syntax is somehow a bit slower in my tests).
I am by no means a Cython fanboy, but I think you are exaggerating the syntactic and readability differences.
Apart from type annotations they are very minor, and well worth the speed benefits and type-error benefits. Given that we are discussing it in the context of SPy: SPy is not fully compatible with Python either, which is quite understandable and in my opinion a good trade-off.
The benchmarking features are great, and interoperability with C libraries is great.
One annoyance I have with Cython, though, is debuggability. But it's an annoyance, not a show-stopper.
Have used in production without problems.
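(For anyone who hasn't tried it, the "pure Python" typing mode mentioned above looks roughly like this. A minimal sketch: with Cython installed, the same file compiles to typed C via cythonize, but it also runs unmodified under plain CPython, where the annotations are inert.)

    import cython

    def integrate_x2(a: cython.double, b: cython.double, n: cython.int) -> cython.double:
        # midpoint rule for the integral of x**2 over [a, b]
        dx: cython.double = (b - a) / n
        total: cython.double = 0.0
        i: cython.int
        for i in range(n):
            x: cython.double = a + (i + 0.5) * dx
            total += x * x * dx
        return total

    print(integrate_x2(0.0, 1.0, 1_000_000))  # ~0.333333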
> SPy is [...] a compiler
> SPy is not a "compiler for Python"
I think it's funny that it's confusing from the first paragraph.
Reading the next sentence clears the confusion:
> SPy is not a "compiler for Python". There are features of the Python language which will never be supported by SPy by design. Don't expect to compile Django or FastAPI with SPy.
Yeah, but then don't say that SPy is an (interpreter and) compiler in the first place? Just say it's an interpreter.
To make it more confusing: SPy is not spyware (at least, I hope)
I thought this was super cool and informative; thank you so much for writing it.
I work a lot in Python and tried to build a runtime type checker for it that basically acted as automatic assertions on all stated types at function boundaries, and it worked really well to bring all the TypeErrors in my code to the surface on first run.
But my knowledge is woefully short on compilers and languages. This definitely gave me a great frame of understanding, so thanks again for writing it.
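(A sketch of the kind of boundary checker described above, using typing.get_type_hints; this toy version only handles plain classes, not generics like list[int].)

    import functools, inspect
    from typing import get_type_hints

    def check_boundaries(func):
        hints = get_type_hints(func)
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                hint = hints.get(name)
                if isinstance(hint, type) and not isinstance(value, hint):
                    raise TypeError(f"{name}={value!r} is not {hint.__name__}")
            result = func(*args, **kwargs)
            ret = hints.get("return")
            if isinstance(ret, type) and not isinstance(result, ret):
                raise TypeError(f"return value {result!r} is not {ret.__name__}")
            return result
        return wrapper

    @check_boundaries
    def add(a: int, b: int) -> int:
        return a + b

    add(1, "2")  # TypeError raised at the call boundary, on first run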
Statically typed. That's just Java with Python syntax again, which the article tries to talk its way out of, but that's what it is.
This assertion is so vague as to be meaningless - say more? You're presumably not asserting that all statically typed languages are fundamentally Java, because that would be saying the patently false "C and Haskell are fundamentally Java". Are you specifically saying the article's hope that it's not "Java with Python syntax" is misplaced; if so, why?
Good level of detail (for me) to understand (some things).
I'm still looking for a Python-ish (at least in syntax) embeddable library for C that is easy to include in a project, one that does not have its own build system with dozens of files... but I don't think one exists.
Overall this looks very interesting - I always thought more could have been done with RPython, but there was never any documentation for it. I do have some nits though:
> The following silly example happily passes mypy
In fairness that's because Mypy is shit. Pyright catches this mistake.
> As such, we need to treat Python type checkers more like linters than actual theorem provers
Again I think this is true of Mypy but Pyright is much closer to a sound type checker.
> redshifting
This is just constant propagation, isn't it? C compilers have been doing this for decades. I don't think we need a silly new term for it. Let's not cause another "tree shaking" situation.
> So far, this is not different than usual constant folding, with the difference that it's guaranteed to happen. What makes it more powerful is the ability to mark some functions as @blue.
That's not different. It's just constant folding with C++'s `consteval`. And @blue is an absolutely abysmal name.
It would be much clearer if @blue were changed to @consteval or @comptime (either's good I think), and you just call it "compile time evaluation" or "constant propagation" instead of "redshifting".
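(The baseline here is easy to see in CPython itself, whose compiler already folds constant expressions; the post's claim is that @blue makes this guaranteed and extends it to whole functions:)

    import dis

    dis.dis(lambda: 2 * 3 + 4)
    # the bytecode simply returns the pre-computed constant 10;
    # no multiplication or addition happens at runtime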
I sympathise with the motivations for this, though I don't use Python much. I occasionally work on a toy Ruby compiler that started as a long blog series. More recently I've picked it up again with heavy AI use - I set Claude working on improving Rubyspec pass rates (which are atrocious). It's chugging along right now, actually.
One of the things I've spent a lot of time thinking about is ways to avoid a lot of the dynamic features of Ruby without affecting actual, real code much.
There's a lot that can be done there - e.g. all of the research on Self and JS VMs is highly applicable.
But I say "real code" because a lot of the "worst" dynamic features of Ruby (and Python) either doesn't appear in production code very often, or at all (there are still aspects of Ruby I have never seen used in real-life use despite having used Ruby for 20 years), or could be mitigated trivially, so I still believe you can do quite decently without a lot of the more complex optimisations.
As an example (from Ruby): You can re-open the Integer class and override +:
(don't do this in IRB; it crashes)
But nobody does that. The problem with a lot of these features isn't that people use them, but that people might use them.
That leaves two main avenues: we can do like the author of this post and strip away the costly features that are rarely used.
Or we can provide ways of promising not to use them through code or options.
The first option is perfectly valid, but I quite like the second:
In Ruby, it turns out a lot of the optimisation challenges go away if you get an app to freeze the most important system classes after setup, because even the most horrific examples of Ruby monkeypatching tend to do most of it only during startup. You then reach a stable state where you can let applications opt in to additional optimisations just by calling "freeze" on a number of objects.
Ruby programs will also do things like dynamically deciding which files to load based on reading the directory, but if you compile an application, most of the time you want that to happen ahead of time, with a few exceptions (e.g. plugins). So, similarly, if you freeze as many classes as possible at a given point, you can partially evaluate manipulation of the runtime up to that point, treat it as mostly static afterward, and fall back to slow paths for anything whose names you can't statically resolve, and still end up with lots of optimisation potential for most of the low-level code.
I think a lot of the same would work for Python, and might bridge the gap between the categories of alternative implementations the author mentions with more predictability than relying on a JIT doing the right analysis.
E.g. your compiler can at least potentially guarantee the circumstances under which it can statically inline the fast path for an Integer operation, given that Integer is frozen, so that you can in fact reason about the code.
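(A toy Python sketch of that opt-in contract, not an optimizer: after freeze(), rebinding class attributes, i.e. monkeypatching, raises instead of silently invalidating whatever a compiler might have specialized.)

    class FrozenMeta(type):
        def __setattr__(cls, name, value):
            if getattr(cls, "_frozen", False):
                raise TypeError(f"{cls.__name__} is frozen; cannot rebind {name!r}")
            super().__setattr__(name, value)

        def freeze(cls):
            cls._frozen = True  # the last write the guard lets through

    class Point(metaclass=FrozenMeta):
        def __init__(self, x, y):
            self.x, self.y = x, y
        def norm2(self):
            return self.x ** 2 + self.y ** 2

    Point.freeze()
    Point.norm2 = lambda self: 0.0  # TypeError: Point is frozen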
I can't view the site on my mobile without accepting cookies.
Specifically Google Analytics cookies, but I found you can uncheck the box.
It's still pretty confusing: unchecking the box doesn't seem to do much (is it actually unchecked when you click it? there's still a checkmark); you still have to click Accept to see the text; and what are you accepting?
In any case, pre-checked boxes are not valid consent under GDPR (“Planet49”).
No cookie notice at all for me using Firefox on Android with the "I Still Don't Care About Cookies" extension.