Comment by perl4ever
4 years ago
I used to work for people who processed emails and loaded them into databases with perl scripts. One day someone asked me if I could help, because the script they were running on a batch of emails was inexplicably freezing or running out of memory; I forget the exact details.
There were maybe a few thousand or tens of thousands of emails, so I came to look at the issue with my usual attitude, which is that if it isn't running instantly on a GHz computer of any kind, something must be horrifically wrong. Not to mention running out of memory.
It turned out that essentially every email was cc'ed to several thousand people; you know, the thread went to a whole company or something. The file size itself wasn't huge in the scheme of things, but our perl script (written by someone much more elevated than me, with a master's degree) read the whole file into memory at once and expanded the headers into perl hashes, multiplying the size.
But there was no reason the whole thing had to be in memory. I guess people learn to program in CS 101 these days as if memory is infinite, because gigabytes might as well be infinity. But if you multiply a certain moderate overhead by tens of thousands, and then by tens of thousands again, suddenly all your gigabytes are gone.
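To make the streaming idea concrete, here is a minimal Python sketch (not the original perl script, which I can't reproduce). It assumes the batch is an mbox-style file where each message starts with a "From " separator line; the names `iter_messages`, `cc_counts`, and `SAMPLE` are made up for illustration. Only one message is held in memory at a time, and only a small summary per message is kept.

```python
import io
from email import message_from_string
from email.utils import getaddresses


def iter_messages(lines):
    """Yield one raw message at a time from an mbox-style stream.

    Assumption: messages are separated by lines starting with "From ".
    The separator line itself is dropped.
    """
    buf = []
    for line in lines:
        if line.startswith("From "):
            if buf:
                yield "".join(buf)
                buf = []
            continue
        buf.append(line)
    if buf:
        yield "".join(buf)


def cc_counts(lines):
    """Yield (subject, number of Cc recipients) for each message.

    Each message is parsed, summarized, and then discarded, so memory
    use is bounded by the largest single message, not the whole file.
    """
    for raw in iter_messages(lines):
        msg = message_from_string(raw)
        cc = getaddresses(msg.get_all("Cc", []))
        yield msg.get("Subject", ""), len(cc)


# Tiny made-up sample standing in for a real mail file.
SAMPLE = (
    "From a@example.com Mon Jan  1 00:00:00 2024\n"
    "Subject: hello\n"
    "Cc: x@example.com, y@example.com\n"
    "\n"
    "body one\n"
    "From b@example.com Mon Jan  1 00:01:00 2024\n"
    "Subject: again\n"
    "\n"
    "body two\n"
)

for subject, n in cc_counts(io.StringIO(SAMPLE)):
    print(subject, n)
```

With a real file you would pass the open file object instead of `io.StringIO(SAMPLE)`; iterating a file object reads it line by line rather than all at once.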
Another thing I remember is when I was given a T-SQL report that typically ran on a few thousand documents, and was asked to run it on all of the millions across all our databases. It was hundreds or thousands of times too slow, because it was running a SQL statement in a procedural loop once per document, and it could be turned into a single statement.
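The loop-versus-single-statement difference looks roughly like this. This is a sketch only: it uses Python with SQLite rather than T-SQL, and the `documents` and `pages` tables are invented for the example, not the real report's schema. Both functions return the same totals, but the first issues one query per document while the second lets the database do the whole thing in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE pages (doc_id INTEGER, words INTEGER);
    INSERT INTO documents VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO pages VALUES (1, 100), (1, 50), (2, 7);
""")


def totals_loop(conn):
    """Procedural version: one extra query for every document."""
    out = {}
    for (doc_id,) in conn.execute("SELECT id FROM documents").fetchall():
        row = conn.execute(
            "SELECT COALESCE(SUM(words), 0) FROM pages WHERE doc_id = ?",
            (doc_id,),
        ).fetchone()
        out[doc_id] = row[0]
    return out


def totals_set_based(conn):
    """Set-based version: a single statement covers every document."""
    rows = conn.execute("""
        SELECT d.id, COALESCE(SUM(p.words), 0)
        FROM documents AS d
        LEFT JOIN pages AS p ON p.doc_id = d.id
        GROUP BY d.id
    """)
    return dict(rows.fetchall())


print(totals_loop(conn))
print(totals_set_based(conn))
```

On three rows the difference is invisible; on millions of documents the per-document query overhead is what dominates.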
So my experience has basically taught me that if something is within a factor of two of optimal, it's good enough, but an incredible amount of code, regardless of high or low level, is way worse than that.
But I've never gotten much pay or glory for fixing this sort of thing. Sometimes I daydream about being a highly paid consultant who goes and turns three nested loops into two, or two into one.
The other week I optimized some processing in a web app from 3 minutes to 11 seconds.
The customer was ecstatic, but I still think it is 2 orders of magnitude too slow.
There is a whole lot of low-hanging fruit in the world. When I am new at a job, if I don't find several improvements of an order of magnitude or two, I am impressed.