Comment by nojokes

5 years ago

I am surprised that this is the top answer (Edit: at the moment, was)

How does splitting code into multiple functions suddenly change the order of the code?

I would expect that these functions would be still called in a very specific order.

And sometimes it does not even make sense to keep this order.

But here is a little example (in made-up pseudocode):

  function positiveInt calcMeaningOfLife(positiveInt[] values)
    positiveInt total = 0
    positiveInt max = 0
    for (positiveInt i=0; i < values.length; i++) 
      total = total + values[i]
      max = values[i] > max ? values[i] : max
    return total - max

===>

  function positiveInt max(positiveInt[] values)
    positiveInt max = 0
    for (positiveInt i=0; i < values.length; i++) 
      max = values[i] > max ? values[i] : max
    return max

  function positiveInt total(positiveInt[] values)
    positiveInt total = 0
    for (positiveInt i=0; i < values.length; i++) 
      total = total + values[i]
    return total

  function positiveInt calcMeaningOfLife(positiveInt[] values)
    return total(values)-max(values)
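
For anyone who wants to run the comparison, here is a minimal Python sketch of both shapes. The names `largest_of`, `total_of`, and the `_inline`/`_split` suffixes are my own, chosen to avoid shadowing Python's built-in `max` and `sum`; plain `int` stands in for the made-up `positiveInt` type.

```python
from typing import Sequence


def calc_meaning_of_life_inline(values: Sequence[int]) -> int:
    """Single-function version: one loop tracks both the sum and the maximum."""
    total = 0
    largest = 0
    for value in values:
        total += value
        largest = value if value > largest else largest
    return total - largest


def largest_of(values: Sequence[int]) -> int:
    """Return the largest value, or 0 for an empty sequence."""
    largest = 0
    for value in values:
        largest = value if value > largest else largest
    return largest


def total_of(values: Sequence[int]) -> int:
    """Return the sum of all values."""
    total = 0
    for value in values:
        total += value
    return total


def calc_meaning_of_life_split(values: Sequence[int]) -> int:
    """Split version: the same result, composed from two helpers."""
    return total_of(values) - largest_of(values)


if __name__ == "__main__":
    sample = [3, 10, 4]
    print(calc_meaning_of_life_inline(sample))  # 17 - 10 = 7
    print(calc_meaning_of_life_split(sample))   # 17 - 10 = 7
```

Both versions return the same value for non-negative input; the split version walks the list twice, which is the usual trade-off of this kind of refactoring.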

Better? No?

7 comments

viktree 5 years ago

> How does splitting code into multiple functions suddenly change the order of the code?

Regardless of how smart your compiler is and all the tricks it pulls to execute the code in much the same order, the order in which humans read the pseudocode is changed:

  01. function positiveInt max(positiveInt[] values)
  02.   positiveInt max = 0
  03.   for (positiveInt i=0; i < values.length; i++) 
  04.     max = values[i] > max ? values[i] : max
  05.   return max

  07. function positiveInt total(positiveInt[] values)
  08.   positiveInt total = 0
  09.   for (positiveInt i=0; i < values.length; i++) 
  10.     total = total + values[i]
  11.   return total

  12. function positiveInt calcMeaningOfLife(positiveInt[] values)
  13.   return total(values) - max(values)

Your modern compiler will take care of the order in which the code is executed, but we as humans need to trace the code line by line as [13, 12, 01, 02, 03, 04, 05, 07, 08, 09, 10, 11]. By comparison, the inline case can be understood sequentially by reading lines 01 to 07 in order.

  01. function positiveInt calcMeaningOfLife(positiveInt[] values)
  02.   positiveInt total = 0
  03.   positiveInt max = 0
  04.   for (positiveInt i=0; i < values.length; i++) 
  05.     total = total + values[i]
  06.     max = values[i] > max ? values[i] : max
  07.   return total - max

> Better? No?

In most cases, yeah, you're probably better off with the two helper functions. max() and total() are common enough operations, and they are named well enough that we can easily guess their intent without having to read the function body.

However, depending on the size of the codebase, the complexity of the surrounding functions, and the location of the two helper functions, it's easy to see that this might not always be the case.

If you are trying to understand the code for the first time, or trying to track down some complex bug, there's a chance that having all the code inline would help you.

Further, splitting up a large inline function is easier than reassembling many small functions (hope you've got your unit tests!).

> And sometimes it does not even make sense to keep this order.

Agreed. But naming and abstractions are not trivial problems. Oftentimes it's in the larger, more complex codebases that you see these practices applied more dogmatically.

nojokes 5 years ago
Well, inlining by the compiler would be expected but we do not only write the code for the machine but also for another human being (that could be yourself at another moment of time of course).
Splitting the code into smaller functions does not automatically guarantee a better design; it is just one heuristic.
A naive implementation of the principle could perhaps have found a less optimal solution:

  function positiveInt max(positiveInt value1, positiveInt value2)
    return value1 > value2 ? value1 : value2

  function positiveInt total(positiveInt value1, positiveInt value2)
    return value1 + value2

  function positiveInt calcMeaningOfLife(positiveInt[] values)
    positiveInt total = 0
    positiveInt max = 0
    for (positiveInt i=0; i < values.length; i++) 
      total = total(total, values[i])
      max = max(max, values[i])
    return total - max

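
As a rough runnable illustration, the naive variant above might look like this in Python. The helper names `larger` and `add` are hypothetical, picked so they do not collide with the local variables or with Python's built-ins, and `int` again stands in for `positiveInt`.

```python
from typing import Sequence


def larger(value1: int, value2: int) -> int:
    """Pairwise helper: return the larger of two values."""
    return value1 if value1 > value2 else value2


def add(value1: int, value2: int) -> int:
    """Pairwise helper: return the sum of two values."""
    return value1 + value2


def calc_meaning_of_life(values: Sequence[int]) -> int:
    """The loop and both accumulators stay here; only the one-line
    operations were extracted, so the function is no easier to follow."""
    total = 0
    largest = 0
    for value in values:
        total = add(total, value)
        largest = larger(largest, value)
    return total - largest


if __name__ == "__main__":
    print(calc_meaning_of_life([3, 10, 4]))  # 17 - 10 = 7
```

The function count went up, but the caller still owns all of the control flow, which is why this split adds indirection without adding abstraction.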
Now this is a trivial example, but we can imagine that instead of max and total we have some more complex calculations or even calls to some external system (a database, an API, etc.).
When faced with a bug, I would certainly prefer the refactoring in the GP comment to the one here (or to the initial implementation).
I think that when inlining feels strictly necessary, there has been a problem with boundary definition, but I agree that being able to view a single execution path inlined can help in understanding the implementation.
I completely agree that naming and abstractions are perhaps the two most complicated problems.
- TeMPOraL 5 years ago
  
  > but we do not only write the code for the machine but also for another human being (that could be yourself at another moment of time of course).
  That's the thing, isn't it? Various arguments have been raised all across this thread, so I just want to put a spotlight on this principle, and say:
  Based on my prior experience, I find code with a few larger functions much more readable than code with lots of small functions. In fact, I'd like a tool that could perform the inlining described by the GP for me, whenever I'm working in a codebase that follows the "lots of tiny functions" pattern.
  Perhaps this is how my brain is wired, but when I try to understand unfamiliar code, the first thing I want to know is what it actually does, step by step, at low level, and only then, how these actions are structured into helpful abstractions. I need to see the lower levels before I'm comfortable with the higher ones. That's probably why I sometimes use step-by-step debugging as an aid to understanding the code...
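
Short of a real inlining tool, one cheap way to see what the small functions actually do, and in what order, is to record calls as they happen. The following is only a sketch of that idea in Python using the standard `sys.settrace` hook. The `trace_calls` helper and the function names are hypothetical, and a real debugger offers far more than this.

```python
import sys
from typing import Any, Callable, List, Tuple


def largest_of(values):
    largest = 0
    for value in values:
        largest = value if value > largest else largest
    return largest


def total_of(values):
    total = 0
    for value in values:
        total += value
    return total


def calc_meaning_of_life(values):
    return total_of(values) - largest_of(values)


def trace_calls(func: Callable[..., Any], *args: Any) -> Tuple[Any, List[str]]:
    """Run func(*args) and record the name of each Python function entered."""
    visited: List[str] = []

    def tracer(frame, event, arg):
        if event == "call":
            visited.append(frame.f_code.co_name)
        return None  # returning None skips line-level tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, visited


if __name__ == "__main__":
    result, visited = trace_calls(calc_meaning_of_life, [3, 10, 4])
    print(result)   # 7
    print(visited)  # ['calc_meaning_of_life', 'total_of', 'largest_of']
```

The recorded list is the execution order flattened into one sequence, which is roughly what stepping through with a debugger gives you by hand.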
  