Comment by geerlingguy
4 years ago
So many times, in higher-level code, it's a foreach loop inside a foreach loop, where the nested loop is calling an API or re-running the same database call 5,000 times.
Move things around or just use a cache... and instant 1000%+ speedup.
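Not anyone's actual code, just a minimal Python sketch of that pattern with made-up names: the slow version repeats the same lookup on every pass of the inner loop, while the fast version memoizes it so each distinct key is fetched once.

    import functools
    import time

    PRICES = {1: 9.99, 2: 4.50, 3: 12.00}  # stand-in data

    def fetch_price(product_id):
        """Stand-in for the repeated API call or database query."""
        time.sleep(0.01)  # simulate a network round trip
        return PRICES[product_id]

    # Slow: the lookup runs once per (order, line) pair, even for repeated products.
    def total_slow(orders):
        total = 0
        for order in orders:
            for line in order["lines"]:
                total += fetch_price(line["product_id"]) * line["qty"]
        return total

    # Faster: memoize the lookup so each distinct product is fetched at most once.
    @functools.lru_cache(maxsize=None)
    def fetch_price_cached(product_id):
        return fetch_price(product_id)

    def total_fast(orders):
        total = 0
        for order in orders:
            for line in order["lines"]:
                total += fetch_price_cached(line["product_id"]) * line["qty"]
        return total

With a handful of distinct products and thousands of lines, the cached version does a few lookups instead of thousands.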
I've seen this too many times to count, often in apps and on sites that are in fairly heavy use.
The answer is often to scale up and pay for 10x more server capacity to handle the load, rather than spend some time optimizing the slow code paths.
My fav optimization story:
A few years back, I was doing some geographic calculations. Basically building a box of lat/lngs and getting all the points within that box.
It was slow. Weirdly slow. I made sure the lat and lng of the records were in the index, but it was still slow.
More testing revealed that the way I was passing the lat/lng into the query caused those values to be converted to strings. The database converted them back to numbers, but in doing so it couldn't use the index, which meant it had to do a lot more work.
Converting the parameters to numbers before passing them in let the index be used, which led to a nice speedup.
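For a concrete (if hypothetical) picture of the fix, here is a small Python/SQLite sketch: the bounding-box values are cast to floats before being bound, so the comparison against the indexed lat/lng columns stays numeric. The table and column names are invented, and whether a string binding actually defeats the index depends on the database engine, which isn't named here.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, lat REAL, lng REAL)")
    conn.execute("CREATE INDEX idx_points_lat_lng ON points (lat, lng)")

    # Bounds often arrive as strings (query params, form fields, etc.);
    # cast them to float before binding so the comparison is numeric.
    min_lat, max_lat = float("39.5"), float("40.5")
    min_lng, max_lng = float("-106.0"), float("-105.0")

    rows = conn.execute(
        "SELECT id, lat, lng FROM points"
        " WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?",
        (min_lat, max_lat, min_lng, max_lng),
    ).fetchall()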
Edit: I thought I wrote this up, but I didn't. It turns out there were some interesting sorting shenanigans that I did write up, though: https://www.mooreds.com/wordpress/archives/547
Sometimes it's worth peering down through those layers of abstraction!
At least that was an accidental conversion and an understandable mistake. I've seen the same with dates stored as strings, from devs unaware that dates are just integers with a lot of maths applied to make them meaningful.
At one place, our end-of-month billing was getting slower and slower over the course of months, from one hour out to about twelve, and it had to be babysat and run in batches in order not to bring the whole database to its knees. We couldn't change the mess of legacy classic ASP, of course, but adding a couple of actual date columns calculated from the string fields on insert/update brought the whole process down to seconds.
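A rough sketch of that trick, not the original system (which was classic ASP against an unnamed database): leave the legacy string column untouched, add a real date column kept in sync by a trigger, and index it so the billing query becomes a range scan instead of per-row string parsing. The MM/DD/YYYY format and every name here are assumptions; Python/SQLite is used only to keep the example self-contained.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        billed_on_text TEXT,   -- legacy string field, e.g. '03/31/2015'
        billed_on DATE         -- real date column added alongside it
    );
    CREATE INDEX idx_invoices_billed_on ON invoices (billed_on);

    -- Keep the real column in sync with the legacy string on insert
    -- (an analogous AFTER UPDATE trigger would cover updates).
    CREATE TRIGGER invoices_set_billed_on AFTER INSERT ON invoices
    BEGIN
        UPDATE invoices
        SET billed_on = substr(NEW.billed_on_text, 7, 4) || '-' ||
                        substr(NEW.billed_on_text, 1, 2) || '-' ||
                        substr(NEW.billed_on_text, 4, 2)
        WHERE id = NEW.id;
    END;
    """)

    conn.execute("INSERT INTO invoices (billed_on_text) VALUES ('03/31/2015')")

    # End-of-month billing can now range-scan the indexed date column
    # instead of parsing strings on every row.
    rows = conn.execute(
        "SELECT id FROM invoices WHERE billed_on BETWEEN ? AND ?",
        ("2015-03-01", "2015-03-31"),
    ).fetchall()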
Whoa, that's an awesome story. Nuts that the slowdown happened over such a short period; there must have been a fair amount of data running through that system. Lots of table scans <shiver>.
In a project I worked on a few years ago, we had persistent performance problems, present for a long time before I showed up. One of the first things I did when I joined the team was do some unsolicited statistical profiling and narrow it down to database interactions. Since we were using a custom-written connector to a graph database, we assumed it was the connector, or the JSON-based API it was using.
The issue got put on the back burner, but it bugged me for a long time. I was later writing some performance-intensive number-crunching code, and ended up writing a hacky tracing profiler[0] to aid my work. Having that available, I took another swing at our database connector issue...
And it turned out the connector was perfectly fine; its overhead was lower than the time the DB typically spent filtering data. The problem was that where I expected a simple operation in our application to do a few DB queries, it did several hundred of them! Something that, for some reason, wasn't obvious in the statistical profiler, but stuck out like a sore thumb on the full trace.
I cut the run time of most user-facing operations in half with a simple tweak to the data connector - it turns out even a tiny unnecessary inefficiency becomes important when it's run a thousand times in a row. But ultimately, we were doing 100x as many queries as we should have, nobody had noticed, and fixing it properly would have been a huge architecture rework. We put it on the roadmap, but then the company blew up for unrelated reasons.
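Their connector and schema aren't shown anywhere, so this is only a generic sketch of the pattern being described: a helper that fires one query per item versus fetching the whole batch in a single round trip. All names are invented.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(i, f"user{i}") for i in range(1000)])

    # N+1 style: one round trip per id. Each query is cheap on its own, but
    # run a thousand times in a row the fixed per-query overhead dominates.
    def names_one_by_one(ids):
        return [conn.execute("SELECT name FROM users WHERE id = ?", (i,)).fetchone()[0]
                for i in ids]

    # Batched: one query for the whole set.
    def names_batched(ids):
        placeholders = ",".join("?" * len(ids))
        rows = conn.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})", list(ids))
        by_id = dict(rows.fetchall())
        return [by_id[i] for i in ids]

The batched version trades a slightly more awkward query for a single round trip, which is usually the part that matters.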
I sometimes think that if we had tracked down and fixed this problem early in development, our sales would have been better and the company might still be alive. At the very least, we'd have been able to iterate faster.
--
[0] - https://github.com/TeMPOraL/tracer