Comment by tombert
1 day ago
In my mind, the biggest advantage to using tail recursion over vanilla loops is the ability to keep using persistent data structures without any (apparent) mutation.
At least in theory, a tail recursive call will be converted into a dumb jump, so there shouldn't be a performance penalty, but since from your code's perspective you're passing in the stuff for the next iteration, you can keep using the pretty and easy-to-reason-about structures without creating any kind of mutable reference.
I'm not 100% sold on tail recursion in a broader sense, but at least with Clojure's loop/recur stuff it is kind of cool to be able to keep using persistent data structures across iterations of loops.
What makes tail recursion "special" is that there exists a semantically equivalent mutable/iterative implementation to something expressed logically as immutable/recursive. [0]
Of course, this means that the same implementation could also be directly expressed logically in a way that is mutable/iterative.
    func pow(uint base, uint n): n == 0 ? return 1 : return n * pow(base, n-1)
is just
    func pow(uint base, uint n): uint res = 1; for(i=0; i<n; i++){ res *= n } return res
There is no real "advantage" to, or reason to "sell" anybody on tail call recursion if you are able to easily and clearly represent both implementations, IMO. It is just a compiler/runtime optimization, which might make your code more "elegant" at the cost of obfuscating how it actually runs, plus new footguns from the possibility that code you think should use TCO actually does not (because not all immutable + recursive functions can use TCO, only certain kinds, and your runtime may not even implement TCO in all cases where it theoretically should).
As an aside, in C++ there is something very similar to TCO called copy-elision/return-value-optimization (RVO): [1]. As with TCO, it is IMO not something to "buy into" or sell yourself on; it is just an optimization you can leverage when structuring your code in a way similar to what the article calls "continuation passing style". And just like TCO, RVO is neat but IMO slightly dangerous because it relies on implicit compiler/runtime optimizations that can be accidentally disabled or made non-applicable as code changes: if someone wanders in and makes small semantic changes to my code relying on RVO/TCO for performance, they could silently break something important.
[0] EXCEPT in practice all implementation differences/optimizations introduce observable side effects that can otherwise impact program correctness or semantics. For example, a program could (perhaps implicitly) rely on the fact that it errors out due to stack overflow when recursing > X times, and so enabling TCO could cause the program to enter new/undesirable states; or a program could rely on a function F performing X floating point operations taking at least Y cycles, i.e. at least Z microseconds, and not function properly when F takes less than Z microseconds after enabling vectorization. This is Hyrum's Law [2].
[1] https://en.wikipedia.org/wiki/Copy_elision#Return_value_opti...
[2] https://www.hyrumslaw.com/
> func pow(uint base, uint n): n == 0 ? return 1 : return n * pow(base, n-1)
Nitpick: that’s not tail recursive. Something like

    def pow(base, n, acc=1) = match n
      case 0  => acc
      default => pow(base, n-1, acc*base)
Given that `base` is never used, I suspect that's not the only issue :)
> Of course, this means that the same implementation could also be directly expressed logically in a way that is mutable/iterative.
Yes, compilers exist.
> There is no real "advantage" to, or reason to "sell" anybody on tail call recursion if you are able to easily and clearly represent both implementations, IMO.
Avoiding mutation avoids headaches.
> [0] EXCEPT in practice all implementation differences/optimizations introduce observable side effects that can otherwise impact program correctness or semantics. For example, a program could (perhaps implicitly) rely on the fact that it errors out due to stack overflow when recursing > X times, and so enabling TCO could cause the program to enter new/undesirable states; or a program could rely on a function F performing X floating point operations taking at least Y cycles, i.e. at least Z microseconds, and not function properly when F takes less than Z microseconds after enabling vectorization. This is Hyrum's Law [2].
These programs are likely not standards compliant. (And that's not just true for the C++ standard but for basically any language with a standard.)
> And just like TCO, RVO is neat but IMO slightly dangerous because it relies on implicit compiler/runtime optimizations that can be accidentally disabled or made non-applicable as code changes:
Who says TCO has to be always implicit? In eg Scheme it's explicit in the standard, and in other languages you can have annotations.
(Whether a call is in tail position is more or less a property you can ascertain syntactically, so your annotation doesn't even have to be understood by the compiler: the linter is good enough. That would catch your 'accidental changes' part.
Things get more complicated, when you have implicit clean-ups happen after returning from the function. Like calling destructors in C++. Then the function call is arguably not in a tail position anymore, and so TCO doesn't apply. Whether this is detectable reliably at compile time depends on the details of your language.)
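Clojure is an example of making it explicit: recur only compiles in tail position, so the "annotation" is checked for you. A minimal sketch:

    (defn fact [n acc]
      (if (zero? n)
        acc
        (recur (dec n) (* n acc))))   ; fine: recur is in tail position

    ;; (defn bad [n] (* n (recur (dec n))))
    ;; won't compile: "Can only recur from tail position"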
Avoiding mutation avoids headaches, but the real headaches are actions (mutations) at a distance, and the choice of tail recursion vs loops makes no difference there.
RVO and URVO are slightly different from TCO in that the language guarantees that they happen. You are correct, though, that small changes can unintentionally turn them off.
I would argue having the parameters that change during the loop be explicit is a very nice advantage. Agree that the things can be equivalent in terms of execution but the readability and explicitness, and the fact that all the parameters are listed in the same place is great.
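Eg in Clojure's loop/recur, every value that changes per iteration is named in the loop header (a sketch, computing fib(10)):

    (loop [i 0, a 0, b 1]            ; everything that varies is declared right here
      (if (= i 10)
        a                            ; => 55
        (recur (inc i) b (+ a b))))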
Agreed. Some people really like FP a lot, and I think it's underrated that the kinds of functions where TCO is applicable tend to be so simple that they are not really that inelegant when expressed imperatively. My true opinion is that relying on TCO is usually choosing ideological adherence to FP (or "code that looks cooler") over reliability/performance/communication.
That said, just as I'd expect experienced C++ programmers to be able to recognize others' code using RVO and be careful not to restructure things to break it, I'd concede that experienced FP programmers might be unlikely to accidentally break others' TCO. It's just that ad absurdum you cannot expect every developer to be able to read every other developers' mind and recognize/workaround all implicit behavior they encounter.
1 reply →
With Named RVO (i.e. you explicitly `return named_variable;`), copy-elision is actually guaranteed by the standard. I believe returning the return value of a function call is also guaranteed not to invoke the copy constructor. Anything else is compiler and optimization level dependent.
To nitpick a bit, NRVO is an optimization as there is no guarantee that it will be performed, but RVO is now guaranteed (you can now return temporary non-copyable/non-movable objects, for example).
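Eg, since C++17 this compiles even though the type can be neither copied nor moved (a minimal sketch):

    struct Pinned {
        Pinned() = default;
        Pinned(const Pinned&) = delete;
        Pinned(Pinned&&) = delete;
    };

    Pinned make() { return Pinned{}; }  // guaranteed elision: built directly in the caller's storage

    int main() { Pinned p = make(); }   // ok in C++17; ill-formed before guaranteed RVO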
I think you're confusing mutation and variable reassignment?
I'm saying that if I do a regular loop, something needs to explicitly mutate, either a reference or a value itself.
`for (var i = 0; i < n; i++)`, for example, requires that `i` mutate in order to keep track of the value so that the loop can eventually terminate. You could also do something like:
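    var i = 0;
    while (i < n) {
        // ... do stuff ...
        i = i + 1;
    }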
There, the reference to `i` is being mutated.
For tail recursion, it's a little different. If I did something like Factorial:
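    function factorial(n, acc = 1) {
        if (n <= 1) {
            return acc;
        }
        return factorial(n - 1, n * acc);  // tail call: passes new values forward
    }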
Doing this, there's no explicit mutation. It might be happening behind the scenes, but everything there from the code perspective is creating a new value.
In something like Clojure, you might do something like
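    (defn vrange [n]
      (loop [i 0 v []]
        (if (< i n)
          (recur (inc i) (conj v i))
          v)))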
(stolen from [1])
In this case, we're creating "new" structures on every iteration of the loop, so our code is "more pure", in that there's no mutation. It might be mutated in the implementation, but not from the author's perspective.
I don't think I'm confusing anything.
[1] https://clojure.org/reference/transients
You could imagine a language like eg Python with a for-each loop that creates a new variable for each run through the loop body.
Basically, just pretend the loop body is a lambda that you call for each run through the loop.
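Something like this sketch, with a made-up for_each helper:

    # pretend "for x in xs: <body>" means calling a fresh lambda per element
    def for_each(xs, body):
        for x in xs:
            body(x)              # each call gets its own binding of x

    for_each(range(3), lambda x: print(x * x))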
It might make sense to think of the common loops as a special kind of combinator (like 'map' and 'filter', 'reduce' etc.) And just like you shouldn't write everything in terms of 'reduce', even though you perhaps could, you shouldn't write everything in terms of the common loops either.
Make up new combinators, when you need them.
For comparison, in Haskell we seldom use 'naked' recursion directly: we typically define a combinator and then use it.
That often makes sense, even if you only use the combinator once.
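For instance (a sketch, with a made-up loopUntil combinator):

    -- define a throwaway combinator instead of writing naked recursion
    loopUntil :: (a -> Bool) -> (a -> a) -> a -> a
    loopUntil done step x
      | done x    = x
      | otherwise = loopUntil done step (step x)

    -- usage: keep doubling until we pass 1000
    -- ghci> loopUntil (> 1000) (* 2) 1
    -- 1024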
> There, the reference to `i` is being mutated.
That's rebinding. Mutation is when you change the state of an object. Variables are not objects. You can't have a reference (aka pointer) pointing to a variable.
3 replies →
Where is the mutation?
2 replies →
Nope. Loops are unnecessary unless you have mutation. If you don't mutate, there's no need to run the same code again: if the state of the world did not change during iteration 1, and you run the same code on it again, the state of the world won't change during iteration 2.
>without creating any kind of mutable reference.
The parameter essentially becomes a mutable reference.
No. There's no mutation happening.
Of course, a compiler might do whatever shenanigans it has to do. But that's an implementation detail and doesn't change how the language works.
(And let's not pretend that CPUs execute something that resembles an imperative language like C under the hood. That might have been true in the PDP-11 or VAX days. These days, with out-of-order execution, pipelining etc, what's actually happening doesn't really resemble one-after-another imperative code much.)
I didn’t eat the sandwich, I just expressed its transformation into energy.
If it does, then you would be able to express a race condition with it.
EDIT: I re-read parent comment.
>> the biggest advantage to using tail recursion over vanilla loops is the ability to keep using persistent data structures without any (apparent) mutation.
Yes
>> at least with Clojure's loop/recur stuff it is kind of cool to be able to keep using persistent data structures across iterations of loops
There's the implied mutation.
No disagreement on that but that's an implementation detail and from the coder's perspective there's no mutable references.
From the coder's perspective it is. There's just different syntax for assigning to them.
1 reply →
From the coder's perspective, there are no mutable references only if the coder does not really rely on or care about the possibility that their code uses TCO. If they actively want TCO then they definitely care about the performance benefits they get from underlying mutability/memory reuse/frame elision.
1 reply →