What is recursion anyway?
Recursion is a way to repeat or loop in more functional/declarative styles of programming - i.e. it's a more declarative version of a for loop.
For example, let's say a professor has a list of students' grades but needs to add 3 points to all of them to bring the curve up. The professor could complete this task in an imperative way:
let grades = [95, 70, 66];

// add `amount` points to every grade, mutating the array in place
function adjustGrades(grades, amount) {
  for (let i = 0; i < grades.length; i++) {
    grades[i] += amount;
  }
}

adjustGrades(grades, 3);
$ node
Welcome to Node.js v20.2.0.
Type ".help" for more information.
> .load grades.js
...
> grades
[ 98, 73, 69 ]
The imperative approach creates a mutable variable grades that points to the array of grade values, along with a mutable i variable that acts as a counter for the for loop. With these two variables, the for loop can begin. On each iteration, a grade element is accessed from the array and mutated (in this case by adding 3 to its value).
But iterating through a list or array of values can be done without loops by using recursion. Recursion uses a function that calls itself (often paired with pattern matching) in place of a counter and loop, but like loops, there must be some exit condition specifying when to stop. For recursion, though, this "exit condition" must also return some final value.
Let's take a look at the same grade adjustment scenario above, but this time using recursion:
module Grades where

grades = [95, 70, 66]

-- "exit condition": an empty list needs no adjusting
adjustGrades _ [] = []
-- otherwise, adjust the first grade and recurse on the rest of the list
adjustGrades amount (head : tail) = (head + amount) : adjustGrades amount tail
$ ghci
GHCi, version 9.2.8: https://www.haskell.org/ghc/ :? for help
ghci> :l grades.hs
[1 of 1] Compiling Grades ( grades.hs, interpreted )
Ok, one module loaded.
ghci> import Grades
ghci> adjustGrades 3 grades
[98,73,69]
adjustGrades is the recursive function doing the work here, and it has two "branches" so to speak - an "exit condition" branch adjustGrades _ [] = [] and a continue recursing branch adjustGrades amount (head : tail) = (head + amount) : adjustGrades amount tail. These "branches" are Haskell's pattern matching feature, here used as part of the function definition.
The "exit condition" branch essentially says "if there are no more elements in the passed list, just return the empty list [] and be done", while the continue recursing branch says "if there's still some element in the passed list (here called the head of the list), add the amount passed to that element and start building a new list from the new element cons'd (think prepended) onto the value returned by re-calling the function on the rest of the list (here called the tail of the list)".
A Pictorial Example
If you're not used to recursion, that last paragraph might sound quite confusing. Let's consider how our professor would manually add three points to the exams she has, and see if that clears things up. Let's assume all the exams are placed in a row on a desk, like so:
The professor begins with the stack of exams and starts by laying out the first exam on the table and putting a sticky note on it that says "Add 3" so she can remember that the exam's score needs to have 3 points added to it:
The professor repeats this for the middle exam:
And the final exam:
And finally the empty list (think of it as the bin that the exams sit in):
This completes the recursive process, as the professor has run out of things to pull and put on the table. Now the professor begins to process (i.e. evaluate) each exam and put it back in the bin/cons it onto the (empty) list:
The professor does the same for each of the remaining exams, first the middle exam and then the first exam, each time consing the exam onto the new list/placing the exam on top of the bin:
Recursion can easily model repetitive processes by repeating a certain function until an end condition is met, in a sense expanding out the behavior by making function call on top of function call. Once the end condition is met, the process goes back in reverse - evaluating the final function call, then the next, ... until we get back to the original function and can finally return a value:
# Recurse...
adjustGrades 3 [95, 70, 66]
(95 + 3) : adjustGrades 3 [70, 66]
(95 + 3) : (70 + 3) : adjustGrades 3 [66]
(95 + 3) : (70 + 3) : (66 + 3) : adjustGrades 3 [] # Hit end condition
# Evaluate...
(95 + 3) : (70 + 3) : (66 + 3) : [] # adjustGrades 3 [] = []
(95 + 3) : (70 + 3) : [69]
(95 + 3) : [73, 69]
[98, 73, 69]
Tail Call Optimization
The Problem with Recursion
You may have noticed that recursion with long lists could be...well, a bit of a problem: with a very long list to go through, the recursive calls keep piling one on top of the other in a long chain of expressions that still need to be evaluated:
exp1 : exp2 : exp3 : exp4 : exp5 : exp6 : ... : expn
Of course, no computer has infinite memory, so this type of recursion becomes a problem with large data sets.
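As a rough sketch of the problem (sumAll here is a made-up helper, not part of our grades module), consider a naive recursive sum: every call has to wait on the result of the next one, so the pending additions pile up, and with a big enough list they can exhaust the available memory:
-- naive recursion: each call leaves an addition pending until the end of the list is reached
sumAll [] = 0
sumAll (x : xs) = x + sumAll xs

-- with a large enough list, the chain of pending additions can run the program out of memory
main = print (sumAll [1 .. 10000000 :: Int])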
The Solution
How can the benefits of recursion be enjoyed without the drawback of a long expression chain building up and taking up memory? The answer is tail call optimization.
Tail call optimization sounds complicated, but it's actually very simple. Instead of recursion that relies on a future return value from a future recursive call, what if each recursive call could completely evaluate itself and then move on to the next recursive call?
-- Recursion (not Tail Call Optimizable)
adjustGrades _ [] = []
adjustGrades amount (head : tail) = (head + amount) : adjustGrades amount tail

-- Tail Call Optimizable Recursion
-- the accumulator carries the result built so far, so nothing is left pending
adjustGrades' _ [] accumulator = accumulator
adjustGrades' amount (head : tail) accumulator =
  adjustGrades' amount tail ((head + amount) : accumulator)

adjustGrades amount grades = adjustGrades' amount grades []
To compare how these two implementations work:
# Recursion (not Tail Call Optimizable)
adjustGrades 3 [95, 70, 66]
(95 + 3) : adjustGrades 3 [70, 66]
(95 + 3) : (70 + 3) : adjustGrades 3 [66]
(95 + 3) : (70 + 3) : (66 + 3) : adjustGrades 3 [] # Hit end condition
(95 + 3) : (70 + 3) : (66 + 3) : [] # adjustGrades 3 [] = []
(95 + 3) : (70 + 3) : [69]
(95 + 3) : [73, 69]
[98, 73, 69]
# Tail Call Optimizable Recursion
adjustGrades 3 [95, 70, 66]
adjustGrades' 3 [95, 70, 66] []
adjustGrades' 3 [70, 66] [98]
adjustGrades' 3 [66] [73, 98]
adjustGrades' 3 [] [69, 73, 98]
[69, 73, 98]
Notice that tail call optimizable recursion doesn't build up a chain - it just evaluates one function call and moves on to the next, using an accumulator to keep track of the result so far. Now we can use recursion that takes up constant rather than linear stack space; unfortunately, though, we do end up with the new list in the reverse order of the original.
Thankfully, a reverse function can easily be implemented with tail call optimizable recursion and be added to get things back in the right order:
-- reverse' avoids clashing with the reverse the Prelude already provides
reverse' [] accumulator = accumulator
reverse' (head : tail) accumulator = reverse' tail (head : accumulator)

adjustGrades amount grades = reverse' (adjustGrades' amount grades []) []
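Assuming the module is reloaded with these definitions, a quick check in GHCi should show the grades back in their original order:
ghci> adjustGrades 3 grades
[98,73,69]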
I keep saying tail call optimizable because implementing adjustGrades this way does not inherently mean you get the benefits of tail call optimization - the language you are using must support this optimization.
Summary
Recursion can be an interesting and potentially more high-level way of iterating. Perhaps instead of using a for or while loop in the future, you can try out recursion and see how it goes. If you use more functional-leaning languages like Scala, OCaml, or Clojure, you can also benefit from writing your recursive functions in a way that they can be tail call optimized.
The nice thing about getting used to recursion is that many languages now have functions based on it, like map, filter, forEach, and reduce. Many mainstream languages (including Ruby, Swift, Python, JavaScript, Rust, Java...) have added some or all of these functions, and to truly understand how they work you must understand how recursion works.
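In fact, map itself can be written with the very same recursive pattern we used for adjustGrades. Here's a rough sketch (myMap is just an illustrative name - the real map lives in the Prelude):
-- an "exit condition" branch and a "continue recursing" branch, just like adjustGrades
myMap _ [] = []
myMap f (x : xs) = f x : myMap f xs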
Once you come to understand recursion, code like this makes loops feel complicated:
adjustGrades amount grades = map (\x -> x + amount) grades
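You can even try it directly in GHCi without defining anything:
ghci> map (\x -> x + 3) [95, 70, 66]
[98,73,69]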