Comment by nasretdinov

3 months ago

I wonder how on earth something like x86-to-ARM translation works so well if games break even after switching from x87 registers to SSE while otherwise preserving all the logic...

toast0 3 months ago

I think the x87 FPU is the only 'weird' floating-point unit left. If you stick with 64-bit double-precision or 32-bit single-precision floats, where the registers are also 64 or 32 bits wide, all the modern stuff behaves the same. x87 is weird because its registers are 80 bits wide. The idea was to get more accurate results from the extra precision, but it ends up weird because if you run out of registers and have to spill to memory, you typically lose precision.
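A minimal C sketch of the spill problem described above, assuming a target where `long double` maps to the 80-bit x87 format (typical for x86-64 Linux toolchains, not for MSVC or most ARM ABIs). The function names are made up for illustration.

```c
#include <assert.h>
#include <float.h>

/* How many more significand bits long double carries than double.
   64 - 53 = 11 where long double is the x87 extended format,
   0 where long double is just an alias for double. */
int extra_significand_bits(void) {
    return LDBL_MANT_DIG - DBL_MANT_DIG;
}

/* Returns 1 if storing x into a 64-bit double slot changes its value.
   This stands in for a register spill that narrows the value. */
int loses_precision_when_spilled(long double x) {
    assert(x == x);  /* NaN never compares equal, so reject it up front */
    volatile double spilled = (double)x;  /* volatile forces a real store */
    return (long double)spilled != x;
}
```

With an 80-bit `long double`, `loses_precision_when_spilled(1.0L + LDBL_EPSILON)` returns 1, because the lowest significand bit does not fit in a 64-bit double. Where `long double` is the same as `double`, it returns 0.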

Edit: since this post was second-chanced, I can add that some of the pre-PC consoles have weird floats too, if they had floats at all. Lots of fun for emulation developers. Even fun for contemporaneous game developers... PilotWings on the SNES shipped with different revisions of its accelerator chip, and the demo only works properly on the early revisions (but I think the later revisions have more accurate math). The PS2 FPU has weirdness around NaN, Infinity, very large numbers, and denormalized numbers. Etc.

kineticdaffodil 3 months ago

What about ARM, where software floating point depends on the compiler?

ErroneousBosh 3 months ago

It's probably because you need weird precision issues, where the numbers are calculated ever so slightly differently, combined with some other effect where that difference matters, like a guard standing slightly too close to a door and getting clipped by it.

I debugged some software synthesizer code a while back (about 20 years ago, now that I think of it) where a build on one platform failed because of a precision bug. I can't remember the details, but there was a lot of "works fine on my machine" type discussion around it. Anyway, it relied on a crude simulation of an RC circuit asymptotically approaching 0 to trigger a state change, but on something like 64-bit Intel with a specific processor it never quite got low enough to trip the comparison, because of something to do with not flushing denormals.

From an electronic standpoint, making it simulate "it's high enough" as being about 0.7 and "it's low enough" as being about 0.01 was far closer to the instrument they were trying to simulate, and making it massively imprecise like that got it going on everything.
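A rough C sketch of that kind of fix, not the original synth code. The 0.7 and 0.01 thresholds are the figures mentioned above. The struct, the function names, and the one-pole update are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define HIGH_THRESHOLD 0.7f   /* "it's high enough" */
#define LOW_THRESHOLD  0.01f  /* "it's low enough" */

typedef struct {
    float level;    /* simulated capacitor voltage, nominally 0..1 */
    bool  is_high;  /* current logical state of the gate */
} rc_gate;

/* Advance one sample: move level a fixed fraction of the way to target,
   then apply the two thresholds with hysteresis. */
bool rc_gate_step(rc_gate *g, float target, float coeff) {
    assert(coeff > 0.0f && coeff < 1.0f);
    g->level += (target - g->level) * coeff;
    if (!g->is_high && g->level >= HIGH_THRESHOLD) {
        g->is_high = true;
    } else if (g->is_high && g->level <= LOW_THRESHOLD) {
        g->is_high = false;
    }
    return g->is_high;
}

/* Discharge toward 0 and count the steps until the gate goes low.
   Returns -1 if that does not happen within max_steps. */
int rc_gate_steps_until_low(rc_gate *g, float coeff, int max_steps) {
    for (int i = 1; i <= max_steps; ++i) {
        if (!rc_gate_step(g, 0.0f, coeff)) {
            return i;
        }
    }
    return -1;
}
```

Because the comparison is against 0.01 rather than against something vanishingly close to 0, the state change fires long before the level gets anywhere near the denormal range, so it no longer matters whether the platform flushes denormals.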

lomase 3 months ago
It's funny, because the only code I have read that flushed denormals was synth code.
- ErroneousBosh 3 months ago
  
  Denormals in audio code are kind of the "perfect storm", because they take ages to deal with - you're suddenly back into softfloat land - and because you have to deal with many thousands of them in a few hundred microseconds.
  We take how fast hardware floating point is for granted. I suspect it would be interesting to compare something compiled with softfloat with a normal benchmark and see just how bad it is.
  It's a great reason to do your DSP code in fixed-point, which is just integer with a couple of steps you have to write down on paper to keep straight until you get to the end. Or, I do, because I suck at arithmetic. Just do it all in machine-length signed ints, and forget all the mystical world of tiny tiny floating point values ;-)
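A small C sketch of the fixed-point idea, assuming the common Q15 format (16-bit signed value, 15 fractional bits) with a 32-bit intermediate. The type and function names are illustrative, not from any particular DSP library.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t q15_t;  /* represents raw / 32768, range [-1, 1) */

/* Multiply two Q15 values. The 32-bit product is in Q30, so divide by
   2^15 to get back to Q15. Division truncates toward zero in C99. */
q15_t q15_mul(q15_t a, q15_t b) {
    int32_t product = ((int32_t)a * (int32_t)b) / 32768;
    if (product > INT16_MAX) {
        product = INT16_MAX;  /* saturate the single overflow case, -1 * -1 */
    }
    return (q15_t)product;
}

/* Repeatedly scale a value by coeff and count the steps until it is 0.
   The magnitude shrinks by at least 1 per step, so the loop ends at an
   exact zero and never passes through anything like a denormal. */
int q15_decay_steps_to_zero(q15_t start, q15_t coeff) {
    assert(coeff > INT16_MIN);  /* |coeff| must be below 1.0 */
    int steps = 0;
    while (start != 0) {
        start = q15_mul(start, coeff);
        ++steps;
    }
    return steps;
}
```

For example, `q15_mul(0x4000, 0x4000)` is 0.5 * 0.5 and gives `0x2000`, which is 0.25. Truncating toward zero is a deliberate choice here: with round-to-nearest, a decay by 0.5 would get stuck at a raw value of 1 instead of reaching 0.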
  

torginus 3 months ago

I remember there was a huge scandal where Intel's compiler, icc (considered the fastest for quite a while back then), defaulted to x87 instead of SSE when it detected an AMD CPU, giving AMD CPUs a handicap (incidentally, that's the reason why x87 used to be much faster on AMD for a while).

A lot of games were shipped with icc, so my guess is they'd work just fine as they were tested with both.

pdw 3 months ago

Rosetta uses software emulation for x87 floating point. That's slow, but in practice it doesn't matter much. Mac software never had a reason to use x87 FP, since every Intel Mac had at least SSE3 support.

ksherlock 3 months ago

There was at least one reason...

    long double x87me(long double a, long double b) {
        return a+b;
    }

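    pushq %rbp            # save the caller's frame pointer
    movq %rsp, %rbp       # set up this function's frame
    fldt 32(%rbp)         # push b (80-bit load) onto the x87 stack
    fldt 16(%rbp)         # push a (80-bit load) onto the x87 stack
    faddp %st(1)          # st(1) += st(0), then pop: the sum is left in st(0)
    popq %rbp             # restore the caller's frame pointer
    retq                  # the long double result is returned in st(0)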

adastra22 3 months ago

what is this?
