Tuesday, 27 November 2012 at 12:18 pm

Recursion is the most common way to traverse a tree data structure. However, there are language specific limitations to recursion. Python cannot recurse at depths more than a set system limit (usually around 1000). Recursion also can potentially use up a lot of memory storing the temporary depths. An iterative approach is possible with infite loops for tree traversal, but it is often longer to write and requires more conditional checks. 

Here I'll go through the basics of using recursion and iteration for tree traversal in python.


Data structure 

We'll use this simple tree structure as our data for the rest of this post:

obj = {'id':'nodeA','children':[{'id':'nodeB','children':[{'id':'nodeD','children':[]},{'id':'nodeE','children':[]}]},{'id':'nodeC','children':[]}]}

Here is the same object with indentation for legibility:

obj = {'id':'nodeA','children':[

Here is a rough scheme of what the tree looks like:

nodeA --> nodeB --> nodeD, nodeE
--> nodeC

NodeA has two child nodes, B and C. Node B has two child nodes, D and E. 


A recursive function is basically just a function that calls itself. For example, let's say we want to find the sum of numbers from 1 to a number, N. If N = 5. We would get the sum of: 1 + 2 + 3 + 4 + 5 = 15. Using loops (iteration), we can easily do this:

def getSumOfRange(N):
sum = 0
for i in range(N + 1):
sum += i
return sum

We can also use recursion and do this:

def recursiveGetSum(N):
if N > 0:
return N + recursiveGetSum(N-1)
return N 

If we were able to see how python interprets this recursive function for N = 5 at each function call, we would see this:

5 + recursiveGetSum(4)
5 + 4 + recursiveGetSum(3)
5 + 4 + 3 + recursiveGetSum(2)
5 + 4 + 3 + 2 + recursiveGetSum(1)
5 + 4 + 3 + 2 + 1 + recursiveGetSum(0)
5 + 4 + 3 + 2 + 1 + 0

The number of recursion 'depths' is 5. The mathematical operation python performs after the last depth is essentially:

5 + (4 + (3 + (2 + (1 + 0))))

Recursion uses more memory. At each recursive depth in the example above, all numbers calculated previously is held in memory. Why use recursion, when the iterative loop method works just as well (and often faster).

The simple answer is that recursion is a more elegant solution. Iterative methods usually will require more logical statements to ensure it is working properly. Some problems are just simpler with recursion. This is especially true for tree traversal

If we want to get all the children of a node on the tree structure introduced previously, we can use a recursive function like so:

def recursiveChildren(node):
	results = [node['id']]
	if len(node['children']) > 0:
		for child in node['children']:
	return results

The iterative version of this function is:

def iterativeChildren(nodes):
	results = []
	while 1:
		newNodes = []
		if len(nodes) == 0:
		for node in nodes:
			if len(node['children']) > 0:
				for child in node['children']:
		nodes = newNodes
	return results

Notice the use of an infinite loop and multiple conditionals. This function also requires the user to supply it with an array of nodes, even if there is only one node we are interested in.

Calling both functions with the tree object:

print recursiveChildren(obj)
print iterativeChildren([obj])
['nodeA', 'nodeB', 'nodeD', 'nodeE', 'nodeC']
['nodeA', 'nodeB', 'nodeD', 'nodeE', 'nodeC']


Unfortunately, python does not seem to implement a tail recursion optimization which would solve the problem of depth limitation and memory usage. However, we should use the algorithms best suited for the problem. Should you be using recursion for problems that'll take more than 1000 recursive depths?