Comment by weitendorf
17 hours ago
What makes tail recursion "special" is that there exists a semantically equivalent mutable/iterative implementation to something expressed logically as immutable/recursive. [0]
Of course, this means that the same implementation could also be directly expressed logically in a way that is mutable/iterative.
func pow(uint base, uint n): n == 0 ? return 1 : return n * pow(base, n-1)
is just
func pow(uint base, uint n): uint res = 1; for(i=0; i<n; i++){ res *= n} return res
There is no real "advantage" to, or reason to "sell" anybody on tail call recursion if you are able to easily and clearly represent both implementations, IMO. It is just a compiler/runtime optimization, which might make your code more "elegant" at the cost of obfuscating how it actually runs + new footguns from the possibility that code you think should use TCO actually not (because not all immutable + recursive functions can use TCO, only certain kinds, and your runtime may not even implement TCO in all cases where it theoretically should).
As an aside, in C++ there is something very similar to TCO called copy-elision/return-value-optimization (RVO): [1]. As with TCO it is IMO not something "buy into" or sell yourself on, it is just an optimization you can leverage when structuring your code in a way similar to what the article calls "continuation passing style". And just like TCO, RVO is neat but IMO slightly dangerous because it relies on implicit compiler/runtime optimizations that can be accidentally disabled or made non-applicable as code changes: if someone wanders in and makes small semantic to changes to my code relying on RVO/TCO for performance they could silently break something important.
[0] EXCEPT in practice all implementation differences/optimizations introduce observable side effects that can otherwise impact program correctness or semantics. For example, a program could (perhaps implicitly) rely on the fact that it errors out due to stack overflow when recursing > X times, and so enabling TCO could cause the program to enter new/undesirable states; or a program could rely on a functin F making X floating point operations taking at least Y cycles in at least Z microseconds, and not function properly when F takes less than Z microseconds after enabling vectorization. This is Hyrum's Law [2].
[1] https://en.wikipedia.org/wiki/Copy_elision#Return_value_opti...
RVO and URVO are slightly different from TCO in that’s the language guarantees that they are required to happen. You are correct though that small changes can accidentally turn it off unintentionally.
With Named RVO (I.e. you explicitly `return named_variable;`) copy-elision is actually guaranteed by the standard. I believe returning the return value of a function call is also guaranteed to not do a copy constructor. Anything else is compiler and optimization level dependent.
> func pow(uint base, uint n): n == 0 ? return 1 : return n * pow(base, n-1)
Nitpick: that’s not tail recursive. Something like def pow(base, n, acc=1) = match n case 0 => acc; default => pow(base, n-1, acc*base)
Given that `base` is never used, I suspect that's not the only issue :)
> Of course, this means that the same implementation could also be directly expressed logically in a way that is mutable/iterative.
Yes, compilers exist.
> There is no real "advantage" to, or reason to "sell" anybody on tail call recursion if you are able to easily and clearly represent both implementations, IMO.
Avoiding mutation avoids headaches.
> [0] EXCEPT in practice all implementation differences/optimizations introduce observable side effects that can otherwise impact program correctness or semantics. For example, a program could (perhaps implicitly) rely on the fact that it errors out due to stack overflow when recursing > X times, and so enabling TCO could cause the program to enter new/undesirable states; or a program could rely on a functin F making X floating point operations taking at least Y cycles in at least Z microseconds, and not function properly when F takes less than Z microseconds after enabling vectorization. This is Hyrum's Law [2].
These programs are likely not standards compliant. (And that's not just true for the C++ standard but for basically any language with a standard.)
> And just like TCO, RVO is neat but IMO slightly dangerous because it relies on implicit compiler/runtime optimizations that can be accidentally disabled or made non-applicable as code changes:
Who says TCO has to be always implicit? In eg Scheme it's explicit in the standard, and in other languages you can have annotations.
(Whether a call is in tail position is more or less a property you can ascertain syntactically, so your annotation doesn't even have to be understood by the compiler: the linter is good enough. That would catch your 'accidental changes' part.
Things get more complicated, when you have implicit clean-ups happen after returning from the function. Like calling destructors in C++. Then the function call is arguably not in a tail position anymore, and so TCO doesn't apply. Whether this is detectable reliably at compile time depends on the details of your language.)
I would argue having the parameters that change during the loop be explicit is a very nice advantage. Agree that the things can be equivalent in terms of execution but the readability and explicitness, and the fact that all the parameters are listed in the same place is great.
Agreed. Some people really like FP a lot, and I think it's underrated that the kinds of functions where TCO is applicable tend to be so simple that they are not really that inelegant when expressed imperatively. My true opinion is that relying on TCO is usually choosing ideological adherence to FP (or "code that looks cooler") over reliability/performance/communication.
That said, just as I'd expect experienced C++ programmers to be able to recognize others' code using RVO and be careful not to restructure things to break it, I'd concede that experienced FP programmers might be unlikely to accidentally break others' TCO. It's just that ad absurdum you cannot expect every developer to be able to read every other developers' mind and recognize/workaround all implicit behavior they encounter.
> [...] and I think it's underrated that the kinds of functions where TCO is applicable tend to be so simple that they are not really that inelegant when expressed imperatively.
I suspect you are only thinking of patterns that are basically equivalent to a loop. I might agree with that.
TCO really shines when you want to implement state machines. See eg https://news.ycombinator.com/item?id=43076088 for an example where using tail calls in Python's interpreter loop gives nice performance benefits. Similar also for LUA.
> [...] I'd concede that experienced FP programmers might be unlikely to accidentally break others' TCO.
Compiler (or linter) checkable annotations would help here. You are right that you want to make it possible for programmers to statically assert somehow that their function call is a tail call.
Btw this reminds me: recursion isn't just something you do in computation, but also in data structures (amongst other things). In eg Rust the members of your data structure are typically just laid out one after another, but when you have a recursive structure (and in certain other cases) you need to box it, otherwise you'd get an infinitely large data structure. Boxing is more or less equivalent to using indirection via a pointer.
However, unboxing isn't really like TCO. It's more like in-lining.
To nitpick a bit, NRVO is an optimization as there is no guarantee that it will be performed, but RVO is now guaranteed (you can now return temporary non-copyable /non-movable objects for example).