The basic idea of recursion is a function that calls itself. It’s a simple idea, but a common one, as it’s a useful way to solve problems that can be broken down into one or more similar, but smaller, problems.
Recursion also comes up frequently in technical interviews for competitive software engineering positions, so it’s well worth understanding. This lesson will introduce the technique, using C++.
The most basic example of a recursive function is this:
void Function(){
  Function();
}
This function behaves somewhat like an infinite loop: each call pushes another copy of itself onto the stack, until the stack runs out of memory and our program is terminated.
That’s not especially useful, so a recursive function generally has two parts:
- The base case, where the function stops calling itself and simply returns - typically via an early return statement
- The recursive case, where the function calls itself with a smaller version of the problem

In this example, we show a function that prints a countdown, using a recursive design:
#include <iostream>

void Countdown(int x){
  // Base Case
  if (x <= 0) return;

  // Recursive Case
  std::cout << x << ", ";
  Countdown(x - 1);
}

int main(){ Countdown(5); }
5, 4, 3, 2, 1,
Fibonacci numbers are a sequence of integers that have many applications in maths and computer science. They are also closely related to the golden ratio, a phenomenon that is frequently observed in nature. More information is available on Wikipedia.
The first two Fibonacci numbers are the same: 1 and 1. From there, each subsequent number is generated by adding the previous two. So, the first 8 numbers in the sequence are: 1, 1, 2, 3, 5, 8, 13, 21
Generating a Fibonacci number is a problem that falls into the domain where recursion is useful. This is because the problem can be solved by solving two similar, but smaller problems.
For example, to generate the 10th Fibonacci number, we calculate the 9th and 8th numbers and then add them together.
Fibonacci(10) == Fibonacci(9) + Fibonacci(8)
More generally:
Fibonacci(n) == Fibonacci(n-1) + Fibonacci(n-2)
We can implement it in C++ like this:
#include <iostream>

int Fibonacci(int n) {
  // Base Cases
  if (n == 1) return 1;
  if (n == 2) return 1;

  // Recursive Case
  return Fibonacci(n - 1) + Fibonacci(n - 2);
}

int main() {
  std::cout << "10th Fibonacci Number: "
    << Fibonacci(10);
}
10th Fibonacci Number: 55
When working with recursion, we should be mindful of performance considerations. This is particularly true when our function is making multiple recursive calls, as in the previous Fibonacci example.
Let's modify our example to see how many calls we’re making:
#include <iostream>

int Calls{0};

int Fibonacci(int n){
  ++Calls;
  if (n == 1) return 1;
  if (n == 2) return 1;
  return Fibonacci(n - 1) + Fibonacci(n - 2);
}

int main(){
  std::cout << "10th Fibonacci Number: "
    << Fibonacci(10);
  std::cout << "\nCalls: " << Calls;
}
Perhaps surprisingly, calculating the 10th Fibonacci number using this technique requires 109 calls to our function.
10th Fibonacci Number: 55
Calls: 109
Calculating the 20th Fibonacci number requires 13,529 calls. Calculating the 40th requires 205 million calls, which causes a noticeable delay whilst our computer works it out.
In computer science, this is an example of an algorithm that scales exponentially. The number of operations required is in the order of $2^n$ - that is, every time $n$ increases by 1, the number of operations required to complete the algorithm approximately doubles.
This is commonly described in "big O" notation as an $O(2^n)$ algorithm.
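To see this doubling for ourselves, we could wrap the call-counting example above in a small loop - a minimal sketch (the loop and the range of n values are just for illustration):

#include <iostream>

int Calls{0};

int Fibonacci(int n){
  ++Calls;
  if (n == 1) return 1;
  if (n == 2) return 1;
  return Fibonacci(n - 1) + Fibonacci(n - 2);
}

int main(){
  // Print the call count for successive values of n
  for (int n{10}; n <= 20; ++n) {
    Calls = 0;
    Fibonacci(n);
    std::cout << "n = " << n << ": " << Calls << " calls\n";
  }
}

Each line of output shows roughly twice as many calls as the line before it.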
To understand why this algorithm requires so much work, let's consider our call to Fibonacci(40). It’s going to call Fibonacci(39) and Fibonacci(38).
We now have two branches of our recursion: the one created by the call to Fibonacci(39), and the one starting at Fibonacci(38). However, the Fibonacci(39) branch is also going to call Fibonacci(38). Because of this, we now have two duplicate branches that are each independently doing exactly the same thing.
The Fibonacci(39) branch and both Fibonacci(38) branches will call Fibonacci(37), so we will have three of those:
This wasteful duplication compounds as we move further down the tree. The two Fibonacci(38) branches and the three Fibonacci(37) branches will all call Fibonacci(36), so we now have 5 separate branches doing the same calculation.
By the time we reach Fibonacci(20), we will have 10,946 branches doing the same calculation.
In a coincidental demonstration of how ubiquitous the Fibonacci sequence is, it appears in the scaling of this algorithm to calculate itself:
- Fibonacci(40) is called 1 time
- Fibonacci(39) is called 1 time
- Fibonacci(38) is called 2 times
- Fibonacci(37) is called 3 times
- Fibonacci(36) is called 5 times
- Fibonacci(35) is called 8 times

…and so on. 10,946 is the 21st Fibonacci number.
The fact that this particular implementation of the algorithm is being used to calculate Fibonacci numbers is entirely a coincidence. The sequence can emerge from any recursive algorithm with this shape, where each call spawns further calls for the previous two, smaller inputs.
By Fibonacci(10), we have over a million branches. And, as we saw from our code example, each of those Fibonacci(10) branches requires 109 calls to determine the 10th Fibonacci number - over 100 million function calls just in that part of the tree.
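We can check these counts with a small instrumentation sketch. The CallCounts map below is an assumption added purely for measurement; it isn’t part of the algorithm itself:

#include <iostream>
#include <map>

// Tracks how many times Fibonacci() is called with each argument
std::map<int, long long> CallCounts;

int Fibonacci(int n){
  ++CallCounts[n];
  if (n == 1) return 1;
  if (n == 2) return 1;
  return Fibonacci(n - 1) + Fibonacci(n - 2);
}

int main(){
  Fibonacci(40);
  for (int n : {40, 39, 38, 37, 36, 35, 20, 10}) {
    std::cout << "Fibonacci(" << n << ") was called "
      << CallCounts[n] << " times\n";
  }
}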
This way of analyzing recursive algorithms comes with some commonly used terminology.
The calls made by a recursive function form a recursion tree, which is what we’ve illustrated in the diagrams above.
Each function call in the tree is sometimes referred to as a node.
The branching factor is the number of child nodes each node creates. We can imagine it being how many branches are created at each layer of the tree. In this example, it is 2, as each function call makes two further function calls (Fibonacci(n-1) and Fibonacci(n-2)).
The recursion depth is the largest number of function calls (or nodes) in a single branch. We can imagine it being approximately equal to the height (or depth) of the tree. In this example, it is approximately 40, as the longest branch runs from Fibonacci(40) all the way down to a base case.
So, we have a performance problem caused by massive duplication of the same work. How do we solve it? The answer is caching: saving data we previously calculated, so we don’t need to calculate it again.
Memoization is a specific form of caching, where we cache the result of a function call for a given set of arguments. So, before the function returns the result, it stores the result in the cache.
Then, in the future, if the function is called again with those same arguments, we can get the return value from the cache, rather than recalculate it.
If a function can return a value from the cache without needing to calculate it, this is sometimes referred to as a cache hit.
If it checks the cache and doesn’t find an appropriate entry, it needs to run the full calculation. This is referred to as a cache miss.
Typically, hashmaps such as a std::unordered_map are used for this cache.
We use a map below and update our recursive function to make use of it. If our Fibonacci() function is called with a number it has already calculated, that number will be in the Cache map, so we can return it from there rather than recalculating it:
#include <iostream>
#include <unordered_map>

int Calls{0};
std::unordered_map<int, int> Cache;

int Fibonacci(int n){
  ++Calls;
  if (n == 1) return 1;
  if (n == 2) return 1;
  if (Cache.contains(n)) return Cache[n];
  Cache[n] = Fibonacci(n - 1) +
    Fibonacci(n - 2);
  return Cache[n];
}

int main(){
  std::cout << Fibonacci(10)
    << " (" << Calls << " calls)";
}
Now, our calculation of the 10th Fibonacci number only takes 17 calls, rather than 109.
55 (17 calls)
Finding the 40th number is now noticeably faster, as it takes only 77 calls, down from 205 million. We’ve moved from an algorithm that scales exponentially ($O(2^n)$) to one that scales linearly ($O(n)$).
We’ve seen that caches are essentially key-value pairs. The key is the input to a function (or some process), whilst the value is the output that the process produces.
In the Fibonacci example, the input to our function was simple - a single integer argument. The return value of Fibonacci(40) is stored in our cache with a key of 40.
But things are not always that simple, so we sometimes need to dedicate some thought to how we create our keys.
For example, what if our function had two arguments? What key would we use to store a function call with the arguments (12, 3)?
We could add them, such that our key would be 15. But (10, 5) would also have a cache key of 15. For our hypothetical function, are these argument lists equivalent? If so, our key strategy is suitable, but if not, it will cause bugs.
Another strategy might be concatenating the arguments, such that the key is 123. But then that would mean (12, 3) and (1, 23) would be treated as equivalent.
Another, more robust strategy is to concatenate them with a separator between arguments, such as a comma. For example, our key for the function call (12, 3) could be the string "12,3", whilst (1, 23) would be "1,23".
This is a very common technique, but even this is not a universal solution. For example, it will not work if our arguments are strings, which may include their own commas. And our arguments may not be easily convertible to strings - they may be complex objects from a custom class.
The key point here is that we need to be considerate in how we generate cache keys, and there is no universal strategy. We need to adapt our technique to our specific function.
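To make the separator idea concrete, here is a minimal sketch that memoizes a hypothetical two-argument Power() function using a "base,exponent" string key. Both the Power() function and the key format are illustrative assumptions, not something from the earlier examples:

#include <iostream>
#include <string>
#include <unordered_map>

std::unordered_map<std::string, long long> Cache;

long long Power(int Base, int Exponent){
  // Build a key such as "12,3" - the comma keeps (12, 3) and (1, 23) distinct
  std::string Key{
    std::to_string(Base) + "," + std::to_string(Exponent)};
  if (Cache.contains(Key)) return Cache[Key];

  long long Result{1};
  for (int i{0}; i < Exponent; ++i) Result *= Base;
  return Cache[Key] = Result;
}

int main(){
  std::cout << Power(12, 3) << ", " << Power(1, 23);
}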
There are other considerations we have to be aware of when dealing with memoization. The first is around side effects. A side effect is anything a function does, besides returning a value. It can include calling other functions, updating the values of variables in the parent scope, freeing memory, and more.
In the following example, our function has the side effect of logging to the console. But whether that happens depends on whether we had a cache hit or a cache miss.
int Square(int x){
  if (Cache.contains(x)) return Cache[x];
  std::cout << "Squaring!"; // Side effect
  return Cache[x] = x * x;
}
A function whose observable behavior changes depending on whether or not something is cached is generally a red flag and should prompt us to reconsider our implementation.
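If the log genuinely needs to stay, one possible mitigation is to perform the side effect before checking the cache, so it happens on every call regardless of whether we hit or miss - a sketch, assuming the same Cache map as before:

int Square(int x){
  std::cout << "Squaring!"; // Now runs on every call, hit or miss
  if (Cache.contains(x)) return Cache[x];
  return Cache[x] = x * x;
}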
Another aspect to consider is that a function’s return value does not necessarily depend on just its arguments. It can also access values from other locations, such as parent scopes or calls to other functions.
The following function will return the wrong value if it encounters a cache hit after the Multiplier variable in the parent scope has changed since that entry was added to the cache.
int Multiplier{5};

int Multiply(int x){
  if (Cache.contains(x)) return Cache[x];
  return Cache[x] = x * Multiplier;
}
When we decide to memoize a function, we need to consider these possibilities and mitigate them. Better still, we can create functions that don’t have side effects and whose output depends only on their inputs, rather than some external state.
A function that does not have any observable side effects, and always returns the same output for a given set of arguments, is called a pure function.
Beyond being easier to memoize, they also tend to be easier to test, and easier to debug. So, when given the opportunity, we should generally prefer to make our functions pure.
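For example, a pure version of the earlier Multiply() might receive the multiplier as a parameter instead of reading it from the parent scope - a minimal sketch:

// Pure: no side effects, and the result depends only on the arguments,
// which makes it safe to memoize (keyed on both x and Multiplier)
int Multiply(int x, int Multiplier){
  return x * Multiplier;
}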
In this lesson, we explored recursion, and how it can solve problems by breaking them down into simpler versions of themselves. The main points we covered include:
- An introduction to recursive functions, their base and recursive cases, and their main use cases
- How recursion trees, branching factor, and recursion depth help us understand the performance of recursive algorithms
- Memoization: caching the results of function calls so duplicated work can be skipped
- Strategies for generating cache keys, and the problems caused by side effects and external state
- Pure functions, and why they’re easier to memoize, test, and debug