Comment by CalChris

18 hours ago

I must have missed this class. How does one convert a recursive descent parser into an iterative one?

Every recursive algorithm can be made into an iterative algorithm if you use an explicit stack instead of the implicit call stack. It's not always cleaner.

In tail-recursive algorithms there is no stack at all; it's just a straight loop.

  def foo(state, acc):  # acc threads the result through, if one is needed
    if base_condition(state): return acc
    return foo(next_state(state), f(state, acc))

Is simply:

  def foo(state):
    acc = initial
    while not base_condition(state):
      acc = f(state, acc)
      state = next_state(state)
    return acc
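
As a concrete instance of that pattern (my example, not from the comment above): factorial with an accumulator, where state is n, f(state, acc) is acc * n, and the base condition is n == 0.

```python
def fact_rec(n, acc=1):
    if n == 0:
        return acc
    return fact_rec(n - 1, acc * n)  # tail call: nothing left to do after it

def fact_iter(n):
    acc = 1
    while n != 0:
        acc = acc * n  # acc = f(state, acc)
        n = n - 1      # state = next_state(state)
    return acc
```

Worth noting that CPython does not eliminate tail calls, so the recursive version still consumes a stack frame per call; the loop runs in constant stack space.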

If it's not tail recursive you introduce a stack, for instance a DFS on a binary tree:

  def search(node, val):
    if node is None: return False # empty tree, only need this check once
    stack = [node]
    while stack:
      n = stack.pop()
      if n.val == val: return True
      if n.right: stack.append(n.right)
      if n.left:  stack.append(n.left)
    return False

Note that the insertion order is reversed from the recursive calls in a typical DFS. We want left to be searched first, along with its children, and only then "recur" back to right and its children, so we push right onto the stack first and then left.

When you have multiple mutually recursive functions (as is likely the case in a recursive descent parser) then things become more complicated, but it's still feasible.
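As a sketch of how that stack-plus-dispatch pattern can look (this example is mine, not from the comment): Hofstadter's F and M sequences are mutually recursive and not tail recursive, yet one loop with an explicit stack of (function, argument, phase) frames evaluates both.

```python
def hofstadter(which, n):
    # F(0)=1, M(0)=0, F(n)=n-M(F(n-1)), M(n)=n-F(M(n-1)).
    # Each frame simulates a call: (function tag, argument, resume phase).
    stack = [(which, n, 0)]
    ret = None
    while stack:
        fn, arg, phase = stack.pop()
        if phase == 0:
            if arg == 0:
                ret = 1 if fn == 'F' else 0       # base cases
            else:
                stack.append((fn, arg, 1))        # resume here after inner call
                stack.append((fn, arg - 1, 0))    # compute F(n-1) or M(n-1)
        elif phase == 1:
            other = 'M' if fn == 'F' else 'F'
            stack.append((fn, arg, 2))            # resume here after other call
            stack.append((other, ret, 0))         # compute M(F(n-1)) or F(M(n-1))
        else:
            ret = arg - ret                       # n - M(F(n-1)) / n - F(M(n-1))
    return ret
```

The phase field plays the role of the return address: it records where in the function body to resume once the simulated call returns.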

  • Sometimes the messy translation into an explicit stack and dispatch loop is necessary, if you want to pause the calculation, serialize the current state, and reconstitute it later. (E.g., if you want to add disk-saved checkpoints, a frequent hassle in some of my projects.) Coroutines can get you the equivalent of pausing and resuming for a recursive routine, but I'm not aware of any language that lets you serialize the call stack.

    • > I'm not aware of any language that lets you serialize the call stack.

      That's basically what continuations provide. Scheme, SML, and others provide them.

      2 replies →

    • It is much easier and more maintainable to convert to continuation passing style. If you also use defunctionalization to allocate closures on a stack instead of a fresh heap allocation for every closure, you will achieve performance on par with an explicit stack. (In fact, defunctionalization is a mechanical transformation that produces exactly the data you would store in an explicit stack!)

      Before I knew about CPS and defunctionalization, I wrote a Python decorator that did exactly the transformation you describe. https://github.com/tylerhou/fiber. Now that I know about CPS and defunctionalization, I realize that my work was not the best implementation (but it was a fun learning experience!).
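
      To illustrate what defunctionalization produces (my own example, unrelated to fiber): summing a binary tree is not tail recursive, but after CPS conversion the continuation defunctionalizes into a plain list of tagged frames, which is exactly an explicit stack.

```python
# Trees are ('leaf', v) or ('node', left, right); the tags below
# name the two kinds of defunctionalized continuation frames.
def tree_sum(tree):
    stack = []  # the defunctionalized continuation
    while True:
        if tree[0] == 'leaf':
            acc = tree[1]
            # "Apply" the continuation to the value we just produced.
            while stack:
                frame = stack.pop()
                if frame[0] == 'sum_right':      # right subtree still pending
                    stack.append(('add_left', acc))
                    tree = frame[1]
                    break
                else:                            # 'add_left': combine results
                    acc = frame[1] + acc
            else:
                return acc                       # continuation exhausted
        else:
            _, left, right = tree
            stack.append(('sum_right', right))   # remember to do right later
            tree = left                          # descend into left first
```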

  • If you have to use an explicit stack, you're still doing recursion. Useful if you're implementing the Ackermann function in FORTRAN 77, I suppose. Unless you're worried about running out of stack space, it's generally easier to reason about if you just use regular recursion.

    That only applies for non-primitive recursive functions, though. Most people rarely encounter those. For primitive recursive functions, it's all down to the nature of the function and what's idiomatic for your language. If you're using Scheme, for example, recursion is the name of the game.

  • When TCO recursion was first developed, it was very explicitly called out as a syntactically and structurally improved GOTO, but still fundamentally a GOTO that could take params.

    Recursion isn't physically real; any book that teaches the abstraction before explaining either the call stack (for non-TCO recursion) or the GOTO context is a book actively trying to gatekeep CS and weed out readers. (Not that any of those old Turbo Pascal/Borland books from the 90s actually shipped code that compiled.)

    I had several people on HN of all places try to "teach me" recursion after this simple line inside a larger comment:

    > It's like acting like recursion is physically real (it's not) when it smuggles in the call stack.

    Recursion IS real abstractly. It's just not physically real, it was developed before we knew how DNA/RNA encoding handles the same issues in a more performant way.

    • For basic structural recursion on finite data structures, all you're doing is case analysis and substitution (i.e. "plugging in" values). How is that gatekeeping?

      Math majors cover hundreds of algorithms per semester often using recursion without thinking much of it. For that matter, same with "higher order functions" (e.g. derivative operators, or even just translation/scaling of a graph). Even in introductory calculus, students cover things like fixed points (e.g. I remember discussing d^4/(dx)^4 sin = sin almost immediately after introducing derivatives).

      Thinking in terms of physical machines is useful for reasoning about performance, but seems like a distraction for learning to think logically/correctly, which seems like a more important first step?

    • I don't see how it would be gatekeeping.

      Recursive functions are a mathematical concept, like the "imaginary" number, or "transcendental" numbers. Or negative numbers for that matter.

      Simple example: the Fibonacci sequence. FIB(1) = 1, FIB(2) = 1, FIB(N) = FIB(N-1) + FIB(N-2).

      There's no programming language or "physical" implementation needed in order to calculate FIB(N) for arbitrary N. Pencil and paper will do for small numbers.

    • > Recursion isn't physically real, any book that teaches the abstraction before explaining either the call stack (for non-TCO recursion) or in the GOTO context

      Do you also believe that loops and functions should only be taught after the call stack and goto are taught? Neither of them are real either by your measure.

      What a silly sentiment.

      1 reply →

    • > It's just not physically real, it was developed before we knew how DNA/RNA encoding handles the same issues in a more performant way.

      That was a sharp left turn -- how do you figure DNA/RNA are relevant here? I feel like iteration pre-dates our modern understanding of RNA in particular (though I could be mistaken) so I struggle to imagine how DNA/RNA were particularly informative in this regard.

You can do it by managing your own stack, but I would argue that doing so makes the algorithm LESS easy to scan and has its own set of problems.

Same way as any other algorithm: with an explicit heap-allocated stack. I have a naive parser where I built operator precedence into the grammar as nested productions instead of using an algorithm like shunting yard. An input of ((((((1)))))) blew the stack. Converted to an explicit stack, it was fine; a depth that was too deep for the call stack was nothing for my own heap-allocated stack. Nasty code, though; I think this serves as one of the counterexamples to the idea that recursive code gets simpler when turned into iteration. The original OP was talking more specifically about tail recursion, which a recursive descent parser won't be.
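A minimal sketch of the idea (my own toy, not the parser from the comment): evaluating fully parenthesized expressions with one heap-allocated stack, so nesting depth is bounded by memory rather than by the call stack. It handles only single-digit operands and + and *.

```python
def eval_expr(s):
    stack = []  # heap-allocated; deep nesting never touches the call stack
    for ch in s:
        if ch == '(' or ch in '+*':
            stack.append(ch)
        elif ch.isdigit():
            stack.append(int(ch))
        elif ch == ')':
            val = stack.pop()
            if stack[-1] == '(':          # redundant parens: ( expr )
                stack.pop()
            else:                         # binary node: ( left op right )
                op = stack.pop()
                left = stack.pop()
                stack.pop()               # discard the matching '('
                val = left + val if op == '+' else left * val
            stack.append(val)
    return stack.pop()
```

With this, an input like '((((((1))))))' at any depth is just a long Python list, where the recursive-descent version would hit the interpreter's recursion limit.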

  • There are many languages out there that can grow the stack these days. If you also have tail recursion, the two are a very neat complement. In Lisps it means being able to write a list-building function in a direct consing way and still be faster than the TCO way.