Comment by TeMPOraL
4 years ago
In a project I worked on a few years ago, we had persistent performance problems that had been present for a long time before I showed up. One of the first things I did when I joined the team was some unsolicited statistical profiling, which narrowed the problem down to database interactions. Since we were using a custom-written connector to a graph database, we assumed the culprit was the connector, or the JSON-based API it used.
The issue went on the back burner, but it bugged me for a long time. Later, while writing some performance-intensive number-crunching code, I ended up building a hacky tracing profiler[0] to aid my work. With that available, I took another swing at our database connector issue...
And it turned out the connector was perfectly fine; its overhead was lower than the time the DB typically spent filtering data. The problem was that where I expected a simple operation in our application to issue a few DB queries, it issued several hundred! For some reason this wasn't obvious in the statistical profiler, but it stuck out like a sore thumb in the full trace.
I cut the run time of most user-facing operations in half with a simple tweak to the data connector - it turns out even a tiny unnecessary inefficiency becomes important when it runs a thousand times in a row. But ultimately, we were issuing 100x as many queries as we should have, nobody had noticed, and fixing it properly would have been a huge architecture rework. We put it on the roadmap, but then the company blew up for unrelated reasons.
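To illustrate the shape of the problem (a hypothetical Python sketch, not our actual stack - the connector, "query language" and numbers here are all made up):

    # Hypothetical sketch of the pattern described above: one query per item
    # issued from a loop, instead of a single batched query.
    import time

    class FakeGraphConnector:
        """Stand-in for a JSON-over-the-wire graph DB connector."""
        PER_CALL_OVERHEAD = 0.002  # pretend each query costs ~2 ms of round-trip/serialization

        def query(self, q, **params):
            time.sleep(self.PER_CALL_OVERHEAD)  # simulate connector + network overhead
            return {"query": q, "params": params}

    def load_items_naive(db, item_ids):
        # What the application effectively did: one query per node, issued from
        # code several layers above the connector, so no single call looked expensive.
        return [db.query("fetch node by id", id=i) for i in item_ids]

    def load_items_batched(db, item_ids):
        # What it should have been doing: one query for the whole batch.
        return db.query("fetch nodes by ids", ids=item_ids)

    if __name__ == "__main__":
        db = FakeGraphConnector()
        ids = list(range(500))
        t0 = time.perf_counter(); load_items_naive(db, ids); t1 = time.perf_counter()
        load_items_batched(db, ids); t2 = time.perf_counter()
        print(f"naive:   {t1 - t0:.3f}s")   # at least ~1 s of pure per-call overhead
        print(f"batched: {t2 - t1:.3f}s")   # roughly one call's worth

In a statistical profiler the naive version just shows time smeared across the connector's query path, which is part of why it looked like the connector itself was slow; a full trace shows the hundreds of individual calls.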
I sometimes think that if we had tracked down and fixed this problem early in development, our sales might have been better, and the company might still be alive. For sure, we'd have been able to iterate faster.
--