Comment by viktree

5 years ago

> How does splitting code into multiple functions suddenly change the order of the code?

Regardless of how smart your compiler is and all the tricks it pulls to execute the codein much the same order, the order in which humans read the pseudo code is changed

  01. function positiveInt max(positiveInt[] values)
  02.   positiveInt max = 0
  03.   for (positiveInt i=0; i < values.length; i++) 
  04.     max = values[i] > max ? values[i] : max
  05.   return max

  07. function positiveInt total(positiveInt[] values)
  08.   positiveInt total = 0
  09.   for (positiveInt i=0; i < values.length; i++) 
  10.     total = total + values[i]
  11.   return total

  12. function positiveInt calcMeaningOfLife(positiveInt[] values)
  13.   return total(values) - max(values)

Your modern compiler will take care of order in which the code is executed, but as humans need to trace the code line-by-line as [13, 12, 01, 02, 03, 04, 05, 07, 08, 09, 10, 11]. By comparison, the inline case can be understood sequentially by reading lines 01 to 07 in order.

  01. function positiveInt calcMeaningOfLife(positiveInt[] values)
  02.   positiveInt total = 0
  03.   positiveInt max = 0
  04.   for (positiveInt i=0; i < values.length; i++) 
  05.     total = total + values[i]
  06.     max = values[i] > max ? values[i] : max
  07.   return total - max

> Better? No?

In most cases, yeah probably your better off with the two helper functions. max() and total() are common enough operations, and they are named well enough that we can easily guess their intent without having to read the function body.

However, depending on the size of the codebase, the complexity of the surrounding functions and the location of the two helper functions it's easy to see that this might not always be the case.

If you want to try and understand the code for the first time, or if you are trying to trace down some complex bug there's a chance having all the code inline would help you.

Further, splitting up a large inline function is more trivial than reassembling many small functions (hope you got your unit tests!).

> And sometimes it does not even make sense to keep this order.

Agreed. But naming and abstractions are not trival problems. Often times it's the larger/more complex codebases, where you see these practices get applied more dogmatically

6 comments

viktree

nojokes 5 years ago

Well, inlining by the compiler would be expected but we do not only write the code for the machine but also for another human being (that could be yourself at another moment of time of course).

Splitting the code into smaller functions does not automatically warrant a better design, it is just one heuristic.

A naive implementation of the principle could perhaps have found a less optimal solution

  function positiveInt max(positiveInt value1, positiveInt value2)
    return value1 > value2 ? value1 : value2

  function positiveInt total(positiveInt value1, positiveInt value2)
    return value1 + value2 

  function positiveInt calcMeaningOfLife(positiveInt[] values)
    positiveInt total = 0
    positiveInt max = 0
    for (positiveInt i=0; i < values.length; i++)
      total = total(total, values[i])
      max = max(max, values[i])
    return total - max

Now this is a trivial example but we can imagine that instead of max and total we have some more complex calculations or even calls to some external system (a database, API etc).

When faced with a bug, I would certainly prefer the refactoring in the GP comment than one here (or the initial implementation).

I think that when inlining feels strictly necessary then there has been problem with boundary definition but I agree that being able to view one single execution path inlined can help to understand the implementation.

I completely agree that naming and abstractions are perhaps two most complicated problems.

TeMPOraL 5 years ago
> but we do not only write the code for the machine but also for another human being (that could be yourself at another moment of time of course).
That's the thing, isn't it? Various arguments have been raised all across this thread, so I just want to put a spotlight on this principle, and say:
Myself, based on my prior experience, find code with few larger functions much more readable than the one with lots of small functions. In fact, I'd like a tool that could perform the inlining described by the GP for me, whenever I'm working in a codebase that follows the "lots of tiny functions" pattern.
Perhaps this is how my brain is wired, but when I try to understand unfamiliar code, the first thing I want to know is what it actually does, step by step, at low level, and only then, how these actions are structured into helpful abstractions. I need to see the lower levels before I'm comfortable with the higher ones. That's probably why I sometimes use step-by-step debugging as an aid to understanding the code...
- hackinthebochs 5 years ago
  
  >the first thing I want to know is what it actually does, step by step, at low level
  I feel like we might be touching on some core differences between the top-down guys and the bottom-up guys. When I read low level code, what I'm trying to do is figure out what this code accomplishes, distinct from "what it's doing". Once I figure it out and can sum up its purpose in a short slogan, I mentally paper over that section with the slogan. Essentially I am reconstructing the higher level narrative from the low level code.
  And this is precisely why I advocate for more abstractions with names that describe its behavior; if the structure and the naming of the code provide me with these purposeful slogans for units of work, that's a massive win in terms of effort to comprehend the code. I wonder if how the bottom-up guys understand code is substantially different? Does your mental model of code resolve to "purposeful slogans" as stand-ins for low level code, or does your mental model mostly hold on to the low level detail even when reasoning about the high level?
  
  3 replies →