A time-travelling door bug in Half Life 2

8 days ago (mastodon.gamedev.place)

Ah, nice story!

This reminds me of another story with the FPU involved. I was a game developer once. We were making a game that consistently triggered assertion failures related to FPU calculations, but only on a single PC in the whole office. The game explicitly set the FPU precision to 32 bits at startup to make all calculations more consistent. However, on that particular PC, there was a fancy handwriting input program that injected its DLL into every process. As you've probably already guessed, that DLL reset the FPU mode to the default in the event handling loop (i.e., on the main thread). I had to move the FPU mode setting code from process initialization into the event handling loop to deal with the damage that third-party DLLs could inflict.
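The workaround above, re-asserting a global numeric mode inside the loop where foreign code can reset it, can be sketched in Python. This is only an analogy: the `decimal` context stands in for the x87 control word, and `GAME_PRECISION`, `third_party_hook`, and `run_frames` are hypothetical names, not anything from the original game.

```python
from decimal import Decimal, getcontext

GAME_PRECISION = 7  # hypothetical "reduced precision" the game wants everywhere


def set_game_precision() -> None:
    """Set the (thread-wide) numeric mode the game relies on."""
    getcontext().prec = GAME_PRECISION


def third_party_hook() -> None:
    """Stand-in for the injected DLL: silently restores the default mode."""
    getcontext().prec = 28


def simulate_frame() -> Decimal:
    """Some calculation whose result depends on the current mode."""
    return Decimal(1) / Decimal(3)


def run_frames(n: int, reassert_each_frame: bool) -> list:
    set_game_precision()  # done once at process initialization
    results = []
    for _ in range(n):
        third_party_hook()  # foreign code runs inside the event loop
        if reassert_each_frame:
            set_game_precision()  # the fix: set the mode again every iteration
        results.append(simulate_frame())
    return results


print(run_frames(2, reassert_each_frame=False))  # results at the foreign precision
print(run_frames(2, reassert_each_frame=True))   # results at the game's precision
```

Setting the mode only at startup is not enough once any other code in the process can change it, so the sketch pays a small per-frame cost to make the state predictable again.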

  • Nice detective work. Global FPU state has sure caused a lot of headaches.

    • I recall that D3D liked poking FPU state too, which of course had all sorts of fun results

Reminds me of what G-Man said in the opening scene of HL2: "The right man in the wrong place can make all the difference in the world."

Wait, so is that "beta" of Half Life 2 VR a thing I can play? If it is, how did I not know about this, and if not... why not?

I'd also love to play Portal, actually. They say it makes you sick, but to my knowledge I'm immune from VR motion sickness, so worth a try...

I have encountered at least one bug at $job which was tracked down to x87 instructions. Our "production" build is deployed on embedded ARM CPUs while we have test builds compiled for x86 (32-bit) and x86_64 (different subsets of functionality). Anyway, the bug only showed up in the 32-bit x86 build. The same code worked fine in production and in the 64-bit test builds.

It turned out to be an x87 bug where a piece of code was actually computing the wrong answer! Logically following the code would make you think that the particular failure in question would never happen, and yet it did. That was quite a rabbit hole to go down to figure out.

>a big innovation of HL2 was the extensive use of a real physics engine. The door and the guard are both physical objects, both have momentum, they impart an impulse on each other, and although the door hinge is frictionless, the guard's boots have some amount of friction with the floor.

It's been a while since I've played HL2 but this isn't exactly how I remember it. While a lot of things were physics objects I thought the doors would just smoothly rotate towards their target position without any physics at all. You can't bump them shut with another physics object for instance.

  • You can't move them (apart from the opening and closing animation), but they can move other objects that are in their way. Both need to be physics objects for that to work, even though the door is just kinematic (i.e. it won't react to forces applied to it). Although if I remember correctly, they are not even fully kinematic. I think you could get them stuck halfway closed by cramming something in the door frame that would get the whole thing jammed.

    • > I think you could get them stuck halfway closed by cramming something in the door frame that would get the whole thing jammed.

      This was a popular griefing tactic when TF2 first came out where you could trap everyone in spawn by crouch-jumping into the spawn door as Scout: https://youtu.be/JUPzN7tp7bQ?t=243

  • Just did some quick testing - the doors definitely have physics and can get stuck on objects and can impart forces. But unimpeded yes, they smoothly open/close.

    I stuck a tire in a door frame and tried to close it, the tire emitted a bunch of dust clouds as the two objects fought before the door finally ejected the tire at high speed.

> The door and the guard are both physical objects, both have momentum, they impart an impulse on each other

I wonder if the term "impulse" here has any connection to the various impulse commands available in the Source engine. I remember using "impulse 101" and causing havoc in the opening plaza area. Spawning zombies on the roofs, sending them after the Combine, etc.

https://developer.valvesoftware.com/wiki/Impulse

Nothing to do with the actual story which was very interesting, but I just find it so absurd that in 2025 we're still splitting posts into paragraphs because of some weird historical character limit on tweets.

And some of those posts are way longer than tweets used to be, but we're still splitting them for no real reason other than to make them tweets instead of blog posts?

  • Worse is better, it's easier to write inline on your Fediverse account than to write a blog post somewhere else and syndicate it

It seems to be typical: some calculations break when switching from x87 to SSE. The same happened with TF2 too. Its ammo calculation code worked slightly differently on the GNU/Linux build of the game, because it was built with SSE instructions (the Windows version still used x87).

  • I think the only visible effect from that was the Engineer's metal, giving +40 or +41 from a small box, depending on the server platform (all classes technically do have metal, but the others can't use it).

    It was always fun to play on a new server and check what OS it was running that way, too. :-)

  • I expect this is / was a very common problem for people porting 32-bit game code to newer compilers. I work on a fairly old codebase that forces use of x87 for a handful of code paths that don't work correctly otherwise. GCC will default to x87 if you do an i386 compile, but will default to SSE for 64-bit builds, so you have to be careful there too.
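As a rough illustration of how the +40 versus +41 metal split mentioned above could arise, here is a Python sketch. The formula (round up 20% of a 200 metal cap, with the ratio stored as a single-precision float) is an assumption made for illustration, not the actual TF2 code, and the sketch does not claim which platform took which path. It only shows that rounding the product to single precision before the `ceil` gives a different integer than keeping it in a wider format.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


MAX_METAL = 200                     # assumed metal cap
SMALL_BOX_RATIO = to_float32(0.2)   # 0.2f is slightly larger than 0.2


def metal_wide() -> int:
    """Keep the product in a wider format, then round up."""
    return math.ceil(MAX_METAL * SMALL_BOX_RATIO)


def metal_narrow() -> int:
    """Round the product to single precision first, then round up."""
    return math.ceil(to_float32(MAX_METAL * SMALL_BOX_RATIO))


print(metal_wide())    # 41
print(metal_narrow())  # 40
```

The wide product is 40.0000006 or so, which `ceil` pushes to 41. In single precision the spacing between neighbouring values near 40 is about 0.0000038, so the same product rounds to exactly 40.0 first.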

I used to work at the studio responsible for the Driver games, and a few years back we dug out the code for the original PC Driver and tried to compile it again, mostly for fun. We had to change a lot of hand-written assembly code to make it build, and discovered that yeah, the game worked but none of the game replays worked, and it was for that exact same reason: better/different floating point precision. Really fun thing to investigate though.

It's a goal of mine to get Valve using Nix. (I hope our in-progress Windows support would make this especially compelling.)

One advantage of this is that it will become very easy to not only build the original source of the game, but also build it with the original toolchain and dependencies, the toolchains for those dependencies, etc. etc., all the way down.

Hopefully something like that at your fingertips would have made finding the root cause of this bug a good bit easier!

  • > It's a goal of mine to get Valve using Nix

    They’re using Arch Linux. Let’s call it a win and move on lol.

    • Using completely different packaging infrastructure for the different platforms you support is no good!

      The goal with Nix should be that you can use the same infrastructure for all of Linux, macOS, and Windows. (And other Unixes, other OSes, etc. etc.)

  • > I hope our in-progress Windows support would make this especially compelling.

    What is the current story for using Nix to build Windows binaries?

    • On Nixpkgs master, you can cross compile to MinGW and Cygwin.

      Native Cygwin builds are also currently in progress: https://github.com/NixOS/nixpkgs/pull/447520. I would expect this to be done very soon, this year even (holidays permitting).

      There is a MinGW build of Nix but it is missing some features. There has been an MSVC build of a fork in the past and I would like to revive that also.

      There are some open questions relating to Nixpkgs's heavy use of Bash, but longer term I would like to compete for Windows in all ways:

      - Support all cross (MinGW, Cygwin/MSYS2, MSVC ABI with LLVM, MSVC with Wine)
      - Be Cygwin packages
      - Be https://github.com/msys2/MINGW-packages / https://github.com/msys2/MSYS2-packages
      - Be VCPKG

      All these things have slightly different trade-offs, and Nixpkgs is very good at portability, so we should simply do them all.

  • Maybe I'm not seeing it. How would the bug finding be easier here? Seems like the same setup. They could compile with recent tools, and they already had the compiled version with old tools (hosted on Steam).

    • You could quickly rule out non-determinism by reproducing the build with the new and old tools.

      You could also try the newer version of the codebase with the older tools (assuming nothing broke / no newer C++ features) if you like.

I wonder how on earth stuff like x86->ARM translation works so well if games break even after switching from x87 registers to SSE preserving all the logic otherwise...

  • I think the x87 FPU is the only 'weird' floating point unit left. I think if you stick with 64-bit double precision floats or 32-bit single precision floats, where the registers are also 64 or 32 bits, all the modern stuff behaves the same. x87 is just weird because its registers are 80 bits ... the idea was to have more accurate results from more precision, but it ends up weird because if you run out of registers and have to spill to memory, you typically lose precision.

    Edit: since this post was second chanced, I can add on that some of the pre-PC consoles have weird floats too. If they had floats at all. Lots of fun for emulation developers. Even fun for contemporaneous game developers... PilotWings on the SNES comes with different revision accelerator chips and the demo only works properly on the early revision chips (but I think? the later revision chips have more accurate math). The PS2 FPU has weirdness around NaN, Infinity, very large numbers, and denormalized numbers. Etc.
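The register-spill effect described above can be simulated without x87 hardware. The following Python sketch uses exact fractions and a hypothetical helper, `round_to_bits`, to model rounding to a 64-bit significand (the x87 extended format) and to a 53-bit significand (a double in memory). It ignores exponent range, denormals, and overflow, so treat it as an illustration of the idea rather than a model of any real compiler's output.

```python
from fractions import Fraction

EXTENDED_BITS = 64  # significand width of the x87 80-bit format
DOUBLE_BITS = 53    # significand width of a 64-bit double


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round x to `bits` significant binary digits (ties to even)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e so that x / 2**e lies in [2**(bits-1), 2**bits).
    e = x.numerator.bit_length() - x.denominator.bit_length() - bits
    while x / Fraction(2) ** e >= 2 ** bits:
        e += 1
    while x / Fraction(2) ** e < 2 ** (bits - 1):
        e -= 1
    scaled = round(x / Fraction(2) ** e)  # Fraction rounds half to even
    return sign * scaled * Fraction(2) ** e


def diff_in_registers(a: Fraction, b: Fraction) -> Fraction:
    """(a + b) - a with the intermediate kept at extended precision."""
    s = round_to_bits(a + b, EXTENDED_BITS)
    d = round_to_bits(s - a, EXTENDED_BITS)
    return round_to_bits(d, DOUBLE_BITS)  # final store to a double


def diff_with_spill(a: Fraction, b: Fraction) -> Fraction:
    """Same expression, but the intermediate is spilled to a double."""
    s = round_to_bits(a + b, EXTENDED_BITS)
    s = round_to_bits(s, DOUBLE_BITS)  # spill: the low bits are lost here
    d = round_to_bits(s - a, EXTENDED_BITS)
    return round_to_bits(d, DOUBLE_BITS)


a, b = Fraction(1), Fraction(1, 2 ** 60)
print(diff_in_registers(a, b) == b)  # True: the small term survives
print(diff_with_spill(a, b) == 0)    # True: the small term is rounded away
```

The same source expression gives two different answers depending only on whether the intermediate happened to stay in a register, which is why results on x87 could change with optimization level or unrelated code changes.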

  • It's probably because you have to have weird precision issues where the numbers are calculated ever so slightly differently, and some other effect like a guard being slightly too close and getting clipped by a door where that difference matters.

    I debugged some software synthesizer code a while back (like 20 years or so now I think of it) where a build of it on one platform failed because of a precision bug. I can't remember the details, but there was a lot of "works fine on my machine" type discussion around it. Anyway it relied on a crude simulation of an RC circuit reaching very close to 0 asymptotically to trigger a state change, but on something like 64-bit Intel with a specific processor it never quite made it low enough to trip the comparison because of something to do with not flushing denormals.

    From an electronic standpoint, making it simulate "it's high enough" as being about 0.7 and "it's low enough" as being about 0.01 was far closer to the instrument they were trying to simulate, and making it massively imprecise like that got it going on everything.
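A minimal Python sketch of that failure mode, assuming a decay of the form `v *= factor` (the function name, the factor of 0.9, and the step limit are all made up for illustration; the original synthesizer code is not available). With denormals enabled, the value gets stuck at a tiny nonzero number and a comparison against zero never fires, while a coarse threshold like 0.01 trips quickly.

```python
def steps_to_reach(threshold: float, factor: float = 0.9, max_steps: int = 20000):
    """Count decay steps until v <= threshold, or return None if it never happens."""
    v = 1.0
    for step in range(max_steps):
        if v <= threshold:
            return step
        v *= factor  # crude stand-in for an RC discharge
    return None


print(steps_to_reach(0.01))  # 44
print(steps_to_reach(0.0))   # None: v stalls at a denormal just above zero
```

Once `v` is down to the smallest denormals, multiplying by 0.9 rounds back to the same value, so it never reaches exactly zero. A platform that flushes denormals to zero would behave differently, which fits the "works fine on my machine" pattern described above.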

  • I remember there was a huge scandal where Intel's compiler, icc (considered to be the fastest for quite a while back then), defaulted to x87 instead of SSE when it detected an AMD CPU, giving AMD CPUs a handicap (incidentally, that's the reason why x87 used to be much faster on AMD for a while).

    A lot of games were shipped with icc, so my guess is they'd work just fine as they were tested with both.

  • Rosetta uses software emulation for x87 floating point. That's slow, but in practice that doesn't matter much. Mac software never had a reason to use x87 FP, every Intel Mac had at least SSE3 support.

    • There was at least one reason...

          long double x87me(long double a, long double b) {
              return a+b;
          }
      
          pushq %rbp
          movq %rsp, %rbp
          fldt 32(%rbp)
          fldt 16(%rbp)
          faddp %st(1)
          popq %rbp
          retq

>But on the SSE version, a whole bunch of tiny precisions are very slightly different, and a combination of the friction on the floor and the mass of the objects means the guard still rotates from the collision, but now he rotates very slightly less far.

Insanity. The values were just right. Just wow.

Meta: I'm going to make a Twitter clone that bans you if you use it as a multi-paragraph blogging platform... god damn !#@#%!!!.