Comment by imtringued

12 hours ago

As far as I know, static recompilation is thwarted by self-modifying code (primarily JITs) and the ability to jump to arbitrary code locations at runtime.

The latter means that even in the absence of a JIT, you would need to achieve 100% code coverage (akin to unit testing or fuzzing) to perform static recompilation; otherwise you need to compile code at runtime, at which point you're back to state-of-the-art emulation with a JIT. The only real downside of JITs is the added latency, similar to the lag induced by shader compilation, but this could be addressed by having a smart code cache instead. That code cache realistically only needs to store a trace of potential starting locations; then the JIT can compile the code before starting the game.
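To make the code-cache idea concrete, here is a minimal Python sketch, not how any particular emulator does it. `EntryPointCache`, `precompile` and the `compile_block` callback are hypothetical names; the assumption is that the emulator records every guest address it jumps to during one session and compiles all of them before the next session starts.

```python
import json
from typing import Callable, Dict, Set


class EntryPointCache:
    """Hypothetical cache: remembers guest addresses that were used as jump targets."""

    def __init__(self) -> None:
        self.entry_points: Set[int] = set()

    def record(self, addr: int) -> None:
        # Called by the emulator whenever execution enters a new block.
        self.entry_points.add(addr)

    def dumps(self) -> str:
        # Only the addresses are stored, not the compiled host code.
        return json.dumps(sorted(self.entry_points))

    @classmethod
    def loads(cls, text: str) -> "EntryPointCache":
        cache = cls()
        cache.entry_points = set(json.loads(text))
        return cache


def precompile(
    cache: EntryPointCache,
    compile_block: Callable[[int], Callable[[], int]],
) -> Dict[int, Callable[[], int]]:
    """Compile every recorded entry point up front, before the game starts."""
    return {addr: compile_block(addr) for addr in sorted(cache.entry_points)}
```

Addresses never seen in an earlier session would still have to fall back to compiling at runtime, so this only hides the latency for code paths that were already exercised.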

Yes, but in practice that isn't a problem. People do write self-modifying code, and jump to random places today. However, it is much less common today than in the past. It is safe to say that most games are developed and run on the developer's PC and then ported to the target system. If they know the target system they will make sure it works on the system from day one, but most developers are going to prefer to run their latest changes on their current system over sending it to the target system. If you really need to take advantage of the hardware you can't do this, but most games don't.

Many games are written in a high-level language (like C...) which doesn't give you easy access to self-modifying code. (Even higher-level languages like Python do, but they are not compiled and so are not part of this discussion.) Likewise, jumping to arbitrary code is limited to function calls for most programmers.

Many games just run on a game engine, and the game engine is something we can port or rewrite to other systems and then enable running the game.

Be careful of the above: most games don't become popular. It is likely the "big ticket games" people are most interested in emulating had the development budget and need to take advantage of the hardware in the hard ways. That is, the small minority of exceptions are the ones we care about the most.

  • This is PS2 emulation, where most engines were still bespoke and every hack in the book was still on the table.

I believe the main interest in recompilation is in using the recompiled source code as a base for modifications.

Otherwise, yeah, a normal emulator JIT basically points a recompiler at each jump target encountered at runtime, which avoids the static analysis problem. AFAIK translating small basic blocks and not the largest reachable set is actually desirable since you want frequent "stopping points" to support pausing, evaluating interrupts, save states, that kind of stuff, which you'd normally lose with a static recompiler.
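A toy Python sketch of that dispatch loop follows. The three-opcode guest ISA (`add`, `loop`, `halt`), the `State` class and the `on_block_boundary` hook are all invented for illustration; a real JIT emits host machine code rather than Python closures. The point is only that blocks are translated lazily at each jump target, and that control returns to the dispatcher between blocks.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instr = Tuple[str, int]

# Made-up guest program: address -> (opcode, argument).
PROGRAM: Dict[int, Instr] = {
    0: ("add", 5),
    1: ("loop", 0),    # decrement counter; jump to address 0 while it is non-zero
    2: ("add", 100),
    3: ("halt", 0),
}


class State:
    def __init__(self, counter: int) -> None:
        self.acc = 0
        self.counter = counter
        self.pc: Optional[int] = 0  # None means the guest has halted


def translate_block(program: Dict[int, Instr], start: int) -> Callable[[State], None]:
    """Translate one basic block: straight-line code up to and including the first branch."""
    ops: List[Tuple[str, int, int]] = []
    pc = start
    while True:
        op, arg = program[pc]
        ops.append((op, arg, pc))
        if op in ("loop", "halt"):
            break
        pc += 1

    def run(state: State) -> None:
        for op, arg, addr in ops:
            if op == "add":
                state.acc += arg
            elif op == "loop":
                state.counter -= 1
                state.pc = arg if state.counter != 0 else addr + 1
            elif op == "halt":
                state.pc = None

    return run


def run_jit(
    program: Dict[int, Instr],
    state: State,
    on_block_boundary: Optional[Callable[[State], None]] = None,
) -> Dict[int, Callable[[State], None]]:
    """Dispatch loop: translate each jump target the first time it is reached."""
    cache: Dict[int, Callable[[State], None]] = {}
    while state.pc is not None:
        if on_block_boundary is not None:
            on_block_boundary(state)  # stopping point: interrupts, pause, save state
        pc = state.pc
        if pc not in cache:
            cache[pc] = translate_block(program, pc)
        cache[pc](state)
    return cache
```

With `State(counter=3)` the loop body runs three times and the guest halts with `acc == 115`, having translated only the two blocks that start at addresses 0 and 2.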

JIT isn't _that_ common in games (although it is certainly present in some, even from the PS2 era), but self-modifying or even self-referencing executables were a quite common memory-saving trick that lingered into the PS2 era. Binaries that would swap different parts in and out from disk were quite common, and some developers kept using really old-school space-saving tricks like reusing partial functions as code gadgets, although this was dying out by the PS2 era.

Emulation actually got easier after around the PS2 era because hardware got a little closer to commodity and console makers realized they would need to emulate their own consoles in the future and banned things like self-modifying code as policy (AFAIK, the PowerPC code segment on both PS3 and Xbox 360 is mapped read-only; although I think SPE code could technically self-modify, I'm not sure this was widespread).

The fundamental challenges in this style of recompilation are mostly offset jump tables and virtual dispatch / function pointer passing; this is usually handled with some kind of static analysis fixup pass to deal with jump tables and some kind of function boundary detection + symbol table to deal with virtual dispatch.
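A rough Python sketch of what such a jump-table fixup pass might do, under stated assumptions: the table holds little-endian signed 32-bit offsets relative to a known base address, instructions are 4-byte aligned (as on MIPS), and the table ends at the first entry that does not land inside the code segment. The function name and the stop-at-first-implausible-entry heuristic are illustrative, not taken from any specific recompiler.

```python
import struct
from typing import List


def resolve_offset_jump_table(
    data: bytes,
    table_off: int,
    max_entries: int,
    base_addr: int,
    code_start: int,
    code_end: int,
) -> List[int]:
    """Turn a table of base-relative offsets into absolute jump targets."""
    targets: List[int] = []
    for i in range(max_entries):
        (rel,) = struct.unpack_from("<i", data, table_off + 4 * i)
        target = base_addr + rel
        # Heuristic: a real entry must point at an aligned address inside the code.
        if not (code_start <= target < code_end) or target % 4 != 0:
            break
        targets.append(target)
    return targets
```

Each resolved target would then be added to the recompiler's list of block entry points, so the code behind an indirect jump is not missed.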

How many PS2-era games used JIT? I would be surprised if there were many of them - most games for the console were released between 2000 and 2006. JIT was still considered a fairly advanced and uncommon technology at the time.

  • I'd say practically none, we were quite memory starved most of the time and even regular scripting engines were a hard sell at times (perhaps more so due to GC rather than interpretation performance).

    Games on PS2 were C or C++ with some VU code (asm or some specialized HLL) for most parts, often with Lua (due to low memory usage) or similar scripting added for minor parts, with bindings to native C/C++ functions.

    "Normal" self-modifying code went out of favour a few years earlier, in the early-mid 90s, and was perhaps more useful on CPUs like the 6502 or x86 that had few registers, so patching constants directly into inner loops was useful (the PS2 MIPS CPU has plenty of registers, so no need for that).

    However, by the mid/late 90s CPUs like the PPro had already added penalties for self-modifying code, so it was already frowned on. Also, PS2-era games often ran with PC versions side-by-side, so you didn't want more platform dependencies than needed.

    Most PS2 performance tuning we did was around resources/memory and the VUs, helped by DMA chains.

    Self modifying code might've been used for copy-protection but that's another issue.

  • A lot of PS2-era games unfortunately used various self-modifying executable tricks to swap code in and out of memory; Naughty Dog games are notorious for this. This got easier in the Xbox 360 and PS3 era where the vendors started banning self-modifying code as a matter of policy, probably because they recognized that they would need to emulate their own consoles in the future.

    The PS2 is one of the most deeply cursed game console architectures (VU1 -> GS pipeline, VU1 microcode, use of the PS1 processor as IOP, etc) so it will be interesting to see how far this gets.

    • Ah - so, not full-on runtime code generation, just runtime loading (with some associated code-mangling operations like applying relocations). That seems considerably more manageable than what I was thinking at first.
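For a sense of what "applying relocations" amounts to, here is a minimal Python sketch. It assumes a simplified, made-up format where the overlay carries a list of byte offsets, each pointing at a little-endian 32-bit absolute address that must be rebased by the load address. Real PS2 overlays and MIPS relocation types are more involved; `load_overlay` is a hypothetical name.

```python
import struct
from typing import Iterable


def load_overlay(image: bytes, reloc_offsets: Iterable[int], load_base: int) -> bytes:
    """Return a copy of the overlay with each relocated word rebased to load_base."""
    buf = bytearray(image)
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", buf, off)
        # Wrap to 32 bits, as the guest address space would.
        struct.pack_into("<I", buf, off, (value + load_base) & 0xFFFFFFFF)
    return bytes(buf)
```

Because the patching is mechanical and driven by data in the file, a recompiler can in principle apply it ahead of time for each overlay, which is what makes this case more tractable than true runtime code generation.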
