
Comment by dehrmann

5 years ago

> Algorithms and data structures are important--to a point

For me, 90% of day-to-day algorithm use is not writing O(nm) or O(n^2) code by accident, and knowing which data structure avoids it (usually a hashtable, occasionally a heap or balanced tree).
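
A minimal sketch of the kind of fix meant here (Python just for illustration, names made up): the nested-scan version quietly hides an O(n*m) loop, and a hashtable (a `set`) turns it into O(n+m).

```python
def common_items_quadratic(a, b):
    """O(n*m): 'x in b' on a list is a linear scan, run once per element of a."""
    return [x for x in a if x in b]

def common_items_hashed(a, b):
    """O(n+m): one pass to build the set, one pass to probe it."""
    b_set = set(b)                    # hashtable membership is O(1) on average
    return [x for x in a if x in b_set]
```

Both return the same result; only the second one survives contact with large inputs.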

It all depends on your use case. If your input is bound to a small N, I’ll take the simplest-to-understand implementation regardless of algorithmic complexity. Your future self and other maintainers will thank you.

  • If the bound is <20 and unlikely to ever exceed 50, yes. Otherwise, be careful; future people will not thank you for that time bomb.

    • It’s safer when it’s hard-coded data and really bad versus dynamic data and sorta bad. I’d rather have O(n!) on a list of length six, because it will break hard long before it gets to 20. O(n√n) on your customer list will slowly creep up, and it would be easy not to notice.
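
The numbers back this up; a quick sketch (Python for illustration, `slow_growth` is a made-up stand-in for the O(n√n) cost):

```python
import math

def slow_growth(n):
    # stand-in for an O(n * sqrt(n)) operation count
    return n * math.isqrt(n)

# O(n!) blows up immediately: trivial at 6, hopeless long before 20.
print(math.factorial(6))     # 720 operations
print(math.factorial(12))    # 479001600 operations -- already painful
# O(n*sqrt(n)) just creeps: 10,000 customers is ~1M operations, easy to miss.
print(slow_growth(10_000))   # 1000000
```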

    • If you're really cool, you make it throw an error if n > 50 along with a comment explaining why, or at the very least leave a comment saying "this is not the optimal algorithm and will break if N gets much bigger than 50".
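
One way that guard might look (a sketch, with a hypothetical brute-force function; the cutoff of 50 is the arbitrary limit from the comment above):

```python
def closest_pair(points):
    """Brute-force O(n^2) closest pair -- clear, but only safe for small n."""
    if len(points) > 50:
        raise ValueError(
            f"closest_pair got n={len(points)}; this brute-force version "
            "was only meant for tiny inputs -- switch to an O(n log n) algorithm"
        )
    best = None
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            d = (x1 - x2) ** 2 + (y1 - y2) ** 2
            if best is None or d < best[0]:
                best = (d, (x1, y1), (x2, y2))
    return best
```

The error names the function, the observed n, and the reason, so whoever trips it years later knows exactly what to fix.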

  • As the guy who has to deal with your code years later when the company is bigger... please think about the complexity, especially if it is a simple problem. Or not - that’s why I have a job.

  • I don't know why you have been downvoted; you do have a point. Algorithms are tailored towards specific purposes. If, for example, you have something that runs with an n of about 5, and it's not in a hot path, the clear, concise solution may well be better. Besides, with small n there is more to the cost of an algorithm than just its asymptotic growth.

    However, to the point of the person you've replied to: It's good to know that stuff to be able to assess the situation in the first place.

  • I'm fond of saying that one of the big differences between theory and practice in CS is that in practice, we don't immediately drop the coefficients. Instead we study the problem to learn what the coefficients are and use that knowledge to guide our approach. For instance, there are times a linear search will just flat out beat a binary search due to cache effects.
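
A sketch of that comparison (Python for illustration; in C with a contiguous array the cache effect is far more pronounced):

```python
import bisect

def linear_search(sorted_list, target):
    # Sequential scan: predictable branches, contiguous memory access.
    for i, v in enumerate(sorted_list):
        if v >= target:
            return i if v == target else -1
    return -1

def binary_search(sorted_list, target):
    # O(log n) comparisons, but each probe can jump around in memory.
    i = bisect.bisect_left(sorted_list, target)
    return i if i < len(sorted_list) and sorted_list[i] == target else -1
```

Same answers either way; which one is faster for a 16-element array is exactly the kind of coefficient question you measure rather than assume.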

    • But if you care about the coefficients the first time you write the code, you are prematurely optimizing. The theory here is wiser than it intended to be :).

      First draft: don't be stupid with asymptotics as PP says.

      Second draft, only if there is an actual problem in prod or wherever: bust out the concrete measurements.
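
"Busting out the measurements" can start as small as stdlib `timeit` (a sketch; the 32-element array and repeat count are arbitrary):

```python
import timeit

setup = "data = sorted(range(32)); target = 17"
linear = timeit.timeit(
    "next(i for i, v in enumerate(data) if v == target)",
    setup=setup, number=10_000)
binary = timeit.timeit(
    "bisect.bisect_left(data, target)",
    setup="import bisect; " + setup, number=10_000)
print(f"linear scan: {linear:.4f}s  binary search: {binary:.4f}s")
```

Which wins depends on your interpreter, hardware, and n -- which is the whole point of measuring instead of guessing.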

  • It may also be faster for small N, since big-O notation swallows the constant factors. It’s not accidental that standard libraries’ sort algorithms fall back to simple insertion sort at the end of the recursion when n is small. (Though please correct me if I’m wrong; I’ve only dug into the sort algorithm’s code in the C++ and Java standard libraries once.)
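
The hybrid described above looks roughly like this (a sketch, not any particular library's implementation; the cutoff of 16 is an arbitrary illustrative value that real libraries tune empirically):

```python
INSERTION_CUTOFF = 16  # small-partition threshold; real libraries tune this

def insertion_sort(a, lo, hi):
    # O(n^2) worst case, but tiny constants and cache-friendly on short runs.
    for i in range(lo + 1, hi + 1):
        key, j = a[i], i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def hybrid_quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= INSERTION_CUTOFF:
        insertion_sort(a, lo, hi)   # small partition: fall back
        return
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:                   # Hoare-style partition
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    hybrid_quicksort(a, lo, j)
    hybrid_quicksort(a, i, hi)
```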

  • In my 25 years of software development, 90% of the time you don't need to optimize the code, and the worst solution in time complexity is totally fine. Then there is another 8% of situations where trading space for time complexity is the solution, e.g. using lookup tables. In the remaining 2% of cases you might need a combination of some data structures or containers from the STL or Boost (in the case of C++). I've never had to roll a special version of a data structure, container, or algorithm myself from scratch, and I work on some really complicated real-time software.
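
A classic instance of that space-for-time trade (a sketch in Python; the table-based popcount idiom, not any specific library's code):

```python
# Spend space once: a 256-entry table of per-byte bit counts...
POPCOUNT_TABLE = [bin(i).count("1") for i in range(256)]

def popcount32(x):
    """...then answer a 32-bit population count with four table lookups
    instead of looping over all 32 bits."""
    return (POPCOUNT_TABLE[x & 0xFF]
            + POPCOUNT_TABLE[(x >> 8) & 0xFF]
            + POPCOUNT_TABLE[(x >> 16) & 0xFF]
            + POPCOUNT_TABLE[(x >> 24) & 0xFF])
```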