Why and How you should always try to replace Recursion with something else in .NET C#
One of the famous mistakes I come across from time to time while performing code reviews is the excessive uncalled-for usages of Recursion.
I can hear someone asking now:
What is wrong with Recursion?!!
My answer is:
Recursion is not always bad if you know when and how to use it. Sometimes there is some better alternatives which you could be missing.
The main concern about using recursion is that it is too expensive in terms of memory consumption. That’s why you need to be cautious about using it.
You don’t believe me, right? Let me show you an example.
The Nodes Series Example
Let’s assume that we have a series of nodes where each Node refers to the node next to it. Our main goal is to traverse this series of nodes.
So, first we define our Node class as follows:
Then now it is time to implement a method to traverse the tree as follows:
In this method, we are using recursion in such a way that each Node is responsible for traversing the rest of the series starting from it.
Great right? Ok, it would work but let me show you something.
Let’s try to do the same job but now with using loops instead of recursion. So, following this way of thinking, we should end up with something like this:
As you can see, in this method we are using loops to traverse the nodes.
Now, let’s start calling these methods and see if they are going to produce the same result or not.
Here we are just creating a series of 5000 nodes and then calling the two methods; TraverseTreeUsingRecursion and TraverseTreeUsingLoops.
Running the application, we should get something like this:
As you can see, both methods are returning the same result, a list of 5000 node names.
However, can we say that both methods are exactly the same in terms of performance and memory consumption? Let me show you something interesting.
I ran a memory analysis between both methods and the results were as follows:
Can you notice the huge difference between the memory allocation in both cases? Yes, this is true; Recursion is too expensive.
However, to be totally honest with you, on this example, even with Recursion, we can apply some modifications to enhance the performance and memory allocation to match the loop implementation.
Now I hear you asking:
How???
Let me show you something interesting. Let’s examine how Recursion works.
This is a simply Recursive function. As you can see, we have some code before the recursion call, the recursion call, and finally the code after the recursion call.
Now, when calling this function, this is what happens in terms of memory allocation.
As you can see, the memory allocation increases while moving through the recursive calls and memory deallocation doesn’t start till the last/deepest recursive call is being resolved.
Additionally, worth to notice here is that the {code before the recursion call} has the biggest contribution in the memory allocation. Therefore, the biggest memory allocation this code needs, the most expensive the whole recursion call would be.
From this we can conclude that if we can minimize the memory allocation needed for the {code before the recursion call}, this would significantly decrease the total memory allocation of the whole recursive method.
Now, back to our example. Trying to minimize the memory allocation of the {code before the recursion call}, we can do the following:
See, instead of creating a new List<string> on each recursion call, we just pass in a reference to a list. Therefore, the same list is used on all levels so no memory allocation is needed on each recursion call.
Now, calling TraverseTreeUsingRecursion should be as follows:
And the memory analysis would be as follows:
See, it is totally different.
Worth to mention here that it is not always easy to achieve the same methodology of enhancement on Recursion methods. Sometimes the call is too complicated. In this case, we have other ways to deal with it.
Findings
As I told you, I used to come across some excessive uncalled-for usages of Recursion while performing code reviews. This has been happening for a while.
So, I decided to study this manner trying to understand what goes in the developer’s mind at that point. Therefore, I decided to stop and ask the developer each time I come across such case.
Most of the answers were about that he didn’t think much about it. Using recursion then was the logical option to him that he didn’t even stop and think about it.
Therefore, it is obvious that sometimes Recursion seems to be more natural and resembling the way our mind works. Simply, when you find yourself repeating some logic in your mind, your brain instantaneously scream Recursion.
So, if you ask me, I will advise you to fight the urge of using Recursion and always try to find an alternative.
Is It Always Possible?
Now, the question that might pop up in your head is:
Can we always replace Recursion with using loops?
Not exactly. Sometimes recursion might be the only way of doing it but believe me, in many cases there would be an alternative that you might be missing because you are more focused on the small picture.
What I usually do is that when I find myself trying to use Recursion, I try to visit some concepts and see if they could be alternatives.
Concepts like:
Using Loops?
Using Stacks and Queues?
Using Caching Maps (as in Dynamic Programming)?
Using Loops in Bottom Up manner (as in Dynamic Programming)?
If any or some of these help me achieve the same results, I usually do some performance and memory analysis to compare them to Recursion and eventually use the best between them.
In the next section, I will show you one of my favorite ways of replacing Recursion which is Using Loops in Bottom Up manner.
Using Loops in Bottom Up Manner
This concept is originating from Dynamic Programming concepts. Explaining Dynamic Programming is out of scope of this article. So, if you need to know more about it, you can search the internet and you would find tons of resources.
Let me introduce you to the Fibonacci series.
According to Wikipedia:
In mathematics, the Fibonacci numbers, commonly denoted Fn, form a sequence, the Fibonacci sequence, in which each number is the sum of the two preceding ones. The sequence commonly starts from 0 and 1, although some authors omit the initial terms and start the sequence from 1 and 1 or from 1 and 2. Starting from 0 and 1, the next few values in the sequence are:[1]
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
The Fibonacci numbers were first described in Indian mathematics,[2][3][4] as early as 200 BC in work by Pingala on enumerating possible patterns of Sanskrit poetry formed from syllables of two lengths. They are named after the Italian mathematician Leonardo of Pisa, later known as Fibonacci, who introduced the sequence to Western European mathematics in his 1202 book Liber Abaci.[5]
In other simple words:
Fn = Fn-1 + Fn-2
F0 = 0
F1 = 1
As you can see, the Fibonacci series, by definition, is based on Recursion. Therefore, it is so easy for our mind to formulate it as follows:
Would it work, yes for sure. However, as we proved before, it would be too expensive in terms of memory.
Furthermore, if we try to call FibonacciUsingRecursion passing in 80 as an input, the compiler would not get back at all. Why?
If you try to visualize the series as it is defined, you would end up with this:
Do you see it? do you notice that we have duplicate recursion trees which would be evaluated more than once. The fib(3) subtree would be evaluated 2 times. the fib(2) subtree would be evaluated 3 times, and so on…
Can you now imagine the number of duplicate subtrees for a fib(80) tree? It would be enormous.
Now, you might ask:
Now what? What can we do about it?
One of the ways of dealing with this is to think about the series definition in a different way.
Let’s analyze this:
fib(5) = fib(4) + fib(3)
fib(4) = fib(3) + fib(2)
fib(3) = fib(2) + fib(1)
fib(2) = fib(1) + fib(0) = 1 + 0 = 1
Do you notice that to get fib(5) we need to have fib(2) up to fib(4)? Then why don’t we do it this way?
We first start with fib(2) and keep going up till we reach fib(5). Actually, we can do it as follows:
fib(2) = 1 + 0 = 1
fib(3) = fib(2) + fib(1)
fib(4) = fib(3) + fib(2)
fib(5) = fib(4) + fib(3)
This means that we only need to start with two values; fib(0) = 0 and fib(1) = 1 and then we keep going up by adding the previous two and shift the two values.
So, following this, you can do something like this:
Now, if you run this, you would get the same result as the recursive solution but with much better performance and memory allocation.
Additionally, if we try to call FibonacciUsingLoops passing in 80 as an input, the compiler would be back with the result in no time.
Therefore, what we can learn from this is that replacing Recursion could make the impossible possible.
Final Thoughts
As you can see, Recursion is too expensive and sometimes it makes it impossible to execute some logic without hitting a barrier.
Therefore, you should always consider other alternatives before using Recursion.
Finally, I hope you enjoyed reading this article as I enjoyed writing it.