Comment by nemothekid
6 months ago
I want to assume that the GTA developers did this hack because it was faster than floating point division on the Playstation 2 or something.
But knowing they were able to blow up loading GTA5 by 5 minutes by just parsing JSON with sscanf, I don't have much hope.
IIRC the whole parsing performance issue was because the original code was written for the SP campaign of GTA5 that only had a handful of objects to parse data for. That was barely a blip in terms of performance impact and AFAIK was written years before GTAOnline was made (where it became an issue - and even then only became an issue much after GTAOnline was first made).
Writing some simple code that works with the data you expect to have without bothering with optimizations is fine. If anything, optimizing it would be one of the actual cases of "premature optimization": even with profiling no real time is spent on that code, your data won't make it spend any time, and you should avoid wild guesses since chances are you'll be wrong (even if in this case it could be a correct guess, it'd be like a broken clock guessing the time is always 13:37).
The actual issue with that code was that, after they reused it for GTAOnline and it started becoming a performance issue some time later as they added more objects, nobody thought to try and see what was wrong.
Are you actually arguing that using a JSON parser for JSON-formatted data is a premature optimization? The solution here was to use a different format, not a somewhat-JSON-compatible hacked together parser.
No, I'm arguing that it wasn't a performance issue for the original purpose of the code. It only became one much later, in a different project, and only after that code had been pushed way beyond what it was originally meant to do.
The premature optimization would be trying to optimize that piece of code without that being necessary given what the code was meant to do.
They were not the only ones to make that mistake, e.g. rapidjson had to fix the same error. Few people expect parsing one token out of sscanf to strlen the entire input (not only that, but there are C++ APIs which call sscanf under the hood).
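To make the sscanf point concrete, here is a minimal sketch of the alternative, assuming the input is just integers separated by commas or spaces (that format and the name `parse_ints` are made up for illustration, this is not the game's actual data or code). If each sscanf call measures the whole remaining input first, pulling N tokens out of one big buffer costs roughly N passes over it. strtol reports where it stopped, so the loop below only moves forward:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parses up to `cap` integers separated by commas or
   spaces from `text` into `out` and returns how many were stored.
   strtol hands back an end pointer, so the cursor only ever advances and
   the buffer is never re-measured from the start. */
size_t parse_ints(const char *text, long *out, size_t cap)
{
    assert(text != NULL && out != NULL);

    size_t n = 0;
    const char *p = text;
    while (n < cap) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p) {
            /* No number at this position: stop at the terminator,
               otherwise skip one separator character and retry. */
            if (*p == '\0')
                break;
            p++;
            continue;
        }
        out[n++] = v;
        p = end; /* resume right after the digits just consumed */
    }
    return n;
}
```

Calling it on "1, 22,333" with a buffer of 8 slots stores 1, 22 and 333 and returns 3. A real JSON parser needs much more than this, of course; the only point is that the cursor is carried forward instead of handing the library the same huge string again for every token.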
The second error of deduplicating values by linear scanning an array was way more egregious.
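For comparison, a rough sketch of deduplicating without the linear scan, assuming the values are 64-bit keys (the `HashSet` type and every function name here are hypothetical, not anything from the game). Checking each new value against an array of everything seen so far is O(n^2) overall; a small open-addressing hash set makes each membership check roughly constant time:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal open-addressing set of uint64_t keys, for illustration only. */
typedef struct {
    uint64_t *keys;
    unsigned char *used; /* 1 if the slot holds a key */
    size_t cap;          /* always a power of two */
} HashSet;

static size_t slot_for(uint64_t key, size_t cap)
{
    /* Multiplicative hash: mix the bits, then mask down to the table size. */
    return (size_t)((key * 11400714819323198485ULL) >> 32) & (cap - 1);
}

/* Sizes the table for `expected` keys; returns 0 on allocation failure. */
static int set_init(HashSet *s, size_t expected)
{
    size_t cap = 16;
    while (cap < expected * 2) /* keep the load factor at or below 0.5 */
        cap *= 2;
    s->keys = malloc(cap * sizeof *s->keys);
    s->used = calloc(cap, 1);
    s->cap = cap;
    if (!s->keys || !s->used) {
        free(s->keys);
        free(s->used);
        return 0;
    }
    return 1;
}

/* Returns 1 if `key` was newly inserted, 0 if it was already present. */
static int set_insert(HashSet *s, uint64_t key)
{
    size_t i = slot_for(key, s->cap);
    while (s->used[i]) {
        if (s->keys[i] == key)
            return 0;
        i = (i + 1) & (s->cap - 1); /* linear probing */
    }
    s->used[i] = 1;
    s->keys[i] = key;
    return 1;
}

/* Removes duplicates in place, keeping first-seen order.
   Returns the new length, or n unchanged if allocation failed. */
size_t dedup(uint64_t *items, size_t n)
{
    HashSet s;
    if (!set_init(&s, n))
        return n;
    size_t out = 0;
    for (size_t i = 0; i < n; i++)
        if (set_insert(&s, items[i]))
            items[out++] = items[i];
    free(s.keys);
    free(s.used);
    return out;
}
```

The table is sized up front and never grows, which is only OK here because `dedup` knows the maximum number of keys before it starts.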
The real, systemic error is that dozens(?) of engineers worked on that product, supposedly often testing the online component and experiencing that wait time first hand; and none thought "wait, parsing JSON doesn't take that long, computers are fast! what's going on?"
I think someone estimated that error cost them millions in revenue? I'm pretty sure a fraction of that could afford an engineer who knows how fast computers ought to be.
GTA was never my wheelhouse, but from what I gathered GTA Online didn't have that much support, and since it was only the initial loading time, and it would have increased over time as the shop content increased, and a very fast machine (e.g. a dev machine) would have had less of an issue, the engineers working on it were probably not that incentivised to dig into it.
Like, even though it's pretty critical to the initial user experience, initial loading time is generally what gets disregarded the most.
> I'm pretty sure a fraction of that could afford an engineer who knows how fast computers ought to be.
It can, if someone cares enough or realises it's an issue, and then someone is motivated enough to dig into it, or has the time to.