Comment by rossant
1 day ago
Am I the only one to be annoyed by this...?
while (this->m_fBladeAngle > 6.2831855) { this->m_fBladeAngle = this->m_fBladeAngle - 6.2831855; }
Like, "let's just write a while loop that could turn into an infinite loop coz I'm too lazy to do a division"
I want to assume the GTA developers did this hack because it was faster than floating-point division on the PlayStation 2 or something.
But knowing they were able to blow up GTA5's loading time by 5 minutes just by parsing JSON with sscanf, I don't have much hope.
IIRC the whole parsing performance issue was because the original code was written for GTA5's SP campaign, which only had a handful of objects to parse data for. That was barely a blip in terms of performance impact, and AFAIK it was written years before GTA Online existed (where it became an issue, and even then only long after GTA Online first launched).
Writing simple code that works with the data you expect, without bothering with optimizations, is fine; if anything, optimizing it up front would be one of the actual cases of "premature optimization": even with profiling, no real time is spent in that code, your data won't make it spend any time, and you should avoid wild guesses since chances are you'll be wrong (even if in this case the guess could have turned out right, it'd be like a broken clock insisting the time is always 13:37).
The actual issue with that code was that, after they reused it for GTA Online and it gradually became a performance problem as they added more objects, nobody thought to investigate what was wrong.
Are you actually arguing that using a JSON parser for JSON-formatted data is premature optimization? The solution here was to use a different format, not a hacked-together, somewhat-JSON-compatible parser.
They were not the only ones to make that mistake; e.g. rapidjson had to fix the same error. Few people expect pulling a single token out of sscanf to strlen the entire input (and on top of that, there are C++ APIs which call sscanf under the hood).
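A minimal sketch of that pattern, with a hypothetical count_ints helper (not the actual GTA code), assuming an sscanf implementation that measures the whole remaining buffer on each call:

#include <cstddef>
#include <cstdio>

// Hypothetical sketch of the pattern (not the actual GTA code): pulling one
// token at a time out of sscanf. Common implementations strlen the whole
// remaining buffer on every call to set up their internal stream state, so
// parsing N tokens out of a large buffer costs O(N * length) instead of O(length).
int count_ints(const char* text, std::size_t len) {
    int count = 0;
    std::size_t pos = 0;
    while (pos < len) {
        int value = 0;
        int consumed = 0;
        // %n reports how many characters this call consumed, so we can advance.
        if (std::sscanf(text + pos, "%d%n", &value, &consumed) == 1) {
            ++count;
            pos += static_cast<std::size_t>(consumed);
        } else {
            ++pos; // skip a non-numeric character and retry
        }
    }
    return count;
}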
The second error, deduplicating values by linearly scanning an array, was way more egregious.
The real, systemic error is that dozens(?) of engineers worked on that product, supposedly often testing the online component and experiencing that wait time firsthand, and not one of them thought "wait, parsing JSON doesn't take that long, computers are fast! What's going on?"
I think someone estimated that error cost them millions in revenue? I'm pretty sure a fraction of that could have paid for an engineer who knows how fast computers ought to be.
I'm willing to bet it was done for performance reasons; subtraction is cheaper than floating-point division. The compiler probably also has some tricks to optimize this further.
There is absolutely no way this could turn into an infinite loop. It could underflow, but for that to happen the angle would have to be less than 2*pi, which would exit the loop.
The article discusses how that turns into an infinite loop and causes a hang.
When you subtract a small float from a very large float, the value doesn't change. This is because the "steps" between float values increase with the size of the value (i.e. floats have coarser resolution for larger magnitudes)
To see this in action, try running the following in a JavaScript interpreter:
console.log(1_000_000_000_000_000_000 - 1);
But that’s “impossible”. It’s an angle between 0 and 2pi. When transformed it might go over a bit, so they added the check.
It will “never” become big.
So why check? It’s unnecessary.
Thus the bug.
If m_fBladeAngle is really large (>2.2e8 back of the envelope), the subtraction will have no effect, and that would be an infinite loop.
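To see that failure mode with a 32-bit float, here is a minimal standalone sketch; the 3.0e8f starting value is just an assumed example well above the threshold, not something taken from the game:

#include <cstdio>

int main() {
    // Once the float is large enough, the gap between adjacent representable
    // values exceeds 2*pi, so subtracting 6.2831855f rounds straight back to
    // the same value and the original loop would stop making progress.
    float bladeAngle = 3.0e8f; // assumed oversized value, not from the game
    float next = bladeAngle - 6.2831855f;
    std::printf("%.1f -> %.1f (decreased: %s)\n",
                bladeAngle, next, next < bladeAngle ? "yes" : "no");
    return 0;
}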
Long shot, but if the value is small, this loop could be faster than division.
If the code runs every frame, the angle is probably always small, and the loop does just one iteration once in a while when it wraps past 2*pi.
For real. The author has clearly never heard of fmod.
fmod takes on the order of 30+ cycles, probably more on 2003-era CPUs, vs. 1 for cmp, 1 for sub, and 1 for jmp.
Sure, the lower bound is nicer here. But when the tradeoff includes an unbounded upper bound, it's not a very attractive option.
I guess the most robust code, handling both performance and unexpected input, would be one iteration of this (leveraging the assumption that angles are either always within bounds, or went out of bounds by a small amount for one frame), followed by an fmod if that assumption is just totally off, something like the sketch below.
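A hedged sketch of that hybrid (wrapBladeAngle is a hypothetical helper, not the game's actual code):

#include <cmath>

// Hypothetical hybrid: handle the expected one-frame overshoot with a cheap
// subtraction, and only fall back to fmod when the angle is wildly out of range.
inline float wrapBladeAngle(float angle) {
    const float kTwoPi = 6.2831855f;
    if (angle > kTwoPi) {
        angle -= kTwoPi;                       // fast path: small overshoot
        if (angle > kTwoPi) {
            angle = std::fmod(angle, kTwoPi);  // slow path: way out of range
        }
    }
    return angle;
}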