← Back to context

Comment by 01100011

7 hours ago

Rule 3 gets me into trouble with CS majors a lot. I'm an EE by education and entered into SW via the bottom floor(embedded C/ASM) so it was late in my career before I knew the formal definition of big-O and complexity.

For most of my career, sticking to rule 3 made the most sense. When the CS major would be annoying and talk about big-O they usually forgot n was tiny. But then my job changed. I started working on different things. Suddenly my job started sounding more like a leetcode interview people complain about. Now n really is big and now it really does matter.

Keep in mind that Rob Pike comes from a different era when programming for 'big iron' looked a lot more like programming for an embedded microcontroller now.

I actually disagree with Rule 3! While numbers are usually small being fast on small cases generally isn't as important as performing acceptably on large cases. So I prefer to take the better big-O so that it doesn't slow down unacceptably on real-world edge-case stresses. (The type of workloads that the devs often don't experience but your big customers will.)

Of course there is a balance to this, the engineering time to implement both options is an important consideration. But given both algorithms are relatively easy to implement I will default to the one that is faster at large sizes even if it is slower at common sizes. I do suspect that there is an implicit assumption that "fancy" algorithms take longer and are harder to implement. But in many cases both algorithms are in the standard library and just need to be selected. If this post focused on "fancy" in terms of actual time to implement rather than speed for common sizes I would be more inclined to agree with it.

I wrote an article about this a while back: https://kevincox.ca/2023/05/09/less-than-quadratic/

  • Rule 3 was true 1989, back then computers were so slow and had barely any ram so most things you did only was reasonable for small number of inputs. Today we almost always have large amounts of inputs so its different.

Well it is hedged with the word "fancy". I think a charitable reading is to understand the problem domain. If N is always small then trying to minimize the big-O is just showing off and likely counterproductive in many ways. If N is large, it might be a requirement.

Most people don't need FFT algorithm for multiplying large numbers, Karatsuba's algorithm is fine. But in some domains the difference does matter.

Personally I usually see the opposite effect - people first reach for a too-naive approach and implement some O(n^2) algorithm where it wouldn't have even been more complex to implement something O(n) or O(n log n). And n is almost always small so it works fine, until it blows up spectacularly.

  • > Personally I usually see the opposite effect - people first reach for a too-naive approach and implement some O(n^2) algorithm where it wouldn't have even been more complex to implement something O(n) or O(n log n). And n is almost always small so it works fine, until it blows up spectacularly.

    Same. People solve in ways that are very obviously going to cause serious problems in only a few short weeks or months and it’s endlessly frustrating. If you’re building a prototype, fine, but if you’re building for production, very far from fine.

    Most frustrating because often there’s next to no cost in selecting and implementing the correct architecture, domain model, data structure, or algorithm up front.

My father did some programming in Fortran and Assembly of various flavors. He was always partial to lookup tables where they could replace complicated conditionals or computations. Memory was precious in his day but it could still be worth it if your program did something repeatedly (which most do).