A tree-based introduction to backtracking

Feb 25, 2020 • Avik Das

A photo of a tree with many branches.

Backtracking is all about exploring the branches of a solution space. Photo by Brandon Green on Unsplash.

In my last post, I introduced the balanced parentheses problem. The solution required trying out multiple possibilities whenever there was a choice to be made, a solution I referred to as backtracking.

If you read my graphical introduction to dynamic programming, you’ll know I frequently bring up choice as a key concept when applying dynamic programming to problems. So how do backtracking and dynamic programming differ, and when do you apply one over the other? In this post, I’ll lay out some fundamental principles to answer those questions, then explore these principles with a few examples.

Backtracking has three fundamental traits:

Let’s see how these traits apply to the last variant of the balanced parentheses problem. First, the choice inherent to our problem was whether to treat a delimiter as an open delimiter or a close delimiter. Let’s take as an example the following delimiter pairs:

a -> A
A -> b
b -> B

We want to check if the following string is balanced:


Because the string is small, we can see the string is balanced, as followed, using height to represent nesting:

  A  b
 a    A
a      A

The choice occurs when the first A is encountered, and we want to know if we should use it to close the previous a or to start a new group. We can represent this choice as a tree, forking into two children whenever we have a choice. In this view, each node in the tree represents the current stack of delimiters that have yet to be closed.

The search tree for the balanced parentheses problem. There is a branch when the first "A" is encountered, as the character can be a close delimiter (top branch) or an open delimiter (bottom branch). Each branch has further divisions.

Visualizing the backtracking algorithm as a tree search.

In this way, the backtracking algorithm amounts to a depth-first search of the solution space. Technically, the search may be over a graph, as certain configurations may be visited multiple times. However, in this case, it’s more likely we’ll visit each configuration only once (a fact that’s detailed later in the post), making it more natural to view the algorithm as a tree search.

Exploring each choice recursively

Notice the first important trait of backtracking: one choice can lead to another, and the resulting choices must be explored recursively. For example, if we assume the first A is an open delimiter, we now face the choice of whether b is an open or close delimiter. This leads to another choice, which would also be explored fully, and so on.

Restoring state after each choice

Secondly, after exploring the subtree in which A is an close delimiter, we would find that subtree does not result in a balanced string. So, we would explore the other subtree in which A is an open delimiter. But before doing so, out program state would be the same as is was before the first subtree was explored: the first aa have been consumed and we’re deciding what to do with the first A.

"a" is popped off the stack, "b" is pushed onto the stack, and more stack mutations are done during the search. After that branch is explored, the stack mutations are undone, "b" is popped off the stack and "a" is pushed back onto the stack. Now that the stack is back to the original state, another branch can be explored.

Each time a character is popped off the stack, it has to be pushed back on before exploring a different branch. Similarly, every time a character is pushed onto the stack, it has to be popped off.

This illustrates the second trait of backtracking. Even though visiting the first subtree messed with the program state by pushing and popping items off its stack, the result after finishing the subtree exploration is as if nothing happened. In other words, each recursive tree exploration has the invariant of not changing the program state at the end of the exploration.

I see people get tripped up when they move from the iterative solution used for the second variant of the problem (a fixed set of delimiters) to a recursive solution for the third variant (user-provided delimiters). Because they are used to mutating a stack across iterations, they have trouble reconciling the mutation with recursion. But, if you maintain the invariant above, you can still mutate a single stack across recursive calls! Just make sure the stack is the same by the end of each function call!

Early termination

Let’s say the algorithm decides to try the following sequence of choices:

  1. The first A is an open delimiter. This leaves us with a stack consisting of aaA and the remainder of the string as bBbAA.

  2. The subsequent b is a close delimiter. This leaves the stack as aa (since the A just got closed) and the remainder of the string as BbAA.

But now, the algorithm is stuck because the following B can only be a close delimiter and it has nothing to close. At this point, the rest of the string doesn’t have to be processed, and the subtree starting with the sequence of decisions above no longer needs to be explored further.

The sequence of choices described earlier, which cannot continue when "B" cannot be matched up with a previous character.

Once the choices above have been made, no more exploration can be done on that branch.

In a similar vein, you also terminate once you’ve found one valid solution, as you don’t need to keep searching the rest of the solution space.

Terminating the exploration of part of the solution space is the third trait of backtracking, and it can lead to substantial performance gains.

Backtracking and dynamic programming

Given these principles, how to backtracking and dynamic programming relate? It turns out both are related, and certain backtracking problems will use dynamic programming in their implementation. But the two are still separate.

Trial-and-error versus being exhaustive

Backtracking is typically used for a search problem, in which you want to find a single solution that works, out of many possible candidates. For example, in the balanced parentheses problem, it’s sufficient to find one balanced configuration, even if there are multiple. Furthermore, early termination means you don’t explore all the smaller subproblems if you can eliminate them early on.

In the bottom-up approach to dynamic programming, you start with smaller subproblems, solving them early so they’re available when you need them later. This is useful when you’ll have to solve all the subproblems eventually, like in the seam carving example I discussed.

An animation that shows the dependencies for each pixel as visited in the seam carving problem. Each cell is highlighted, going left to right, row by row from top to bottom.

Because the seam carving problem deals with pixels, each pixel's subproblem has to be solved at some point.

However, when searching for a single solution, you probably don’t want to search everything! This actually came up in the change making problem I discussed earlier, where we used the top-down approach (memoization) instead of the bottom-up approach to solve the problem.

A 2D table of subproblems, but some of the cells are dimmed out because they never get visited.

Because memoization is used for the change making problem, not all subproblems in the dependency graph are visited.

The change making problem still was concerned with finding the total number of valid configurations of coins. But, if the problem was to find one valid configuration, then backtracking with memoization would be a good fit.

Representing nodes in the dependency graph

With dynamic programming, the first step was to define a recurrence relation, a recursive function with an important property:

The function should be identifiable by some integer inputs. This will allow us to associate the inputs with already-computed results, as well as evaluate the function in a defined order.

The ability to associate subproblems with integers meant the node for each subproblem was easy to represent: one subproblem is one list of integers. Now, think about the inputs to each recursive call in the balanced parentheses problem. Not only is there the remainder of the string, but there’s a stack of open delimiters. How do you even represent these inputs as integers?!

(Okay, you can represent them as integers, but in practice, you don’t really want to.)

Combinations of stack states and remaining suffixes, with most of them crossed out. For example, it's not possible that stack contains just a single "a" and the remainder of the string is "AbBAA", even though it's part of the search space.

Even though many combinations of stack states and remaining suffixes are part of the search space, most of them are not actually ever encountered in the search.

So backtracking without dynamic programming is useful when either:

In these cases, you forgo the memoization or the table structure that’s integral to dynamic programming.

Backtracking is a useful technique for solving search problems, with a few fundamental properties:

These properties can be compatible with dynamic programming, and indeed, dynamic programming can be a tool to implement a backtracking algorithm. However, the two are separate and are used for different classes of problems.

In later posts, I plan to visit some more complicated backtracking problems to see how they utilize the properties above.