September 6th, 2023

Detecting whether a tree-like data structure contains a cycle

Suppose you are given a tree-like data structure, in the sense that you are given a root node, and each node has children. You want to validate that the data structure indeed represents a tree, with no cycles.

You can actually solve this problem by combining two things you already know how to do.

First, you can walk the tree and generate elements. If the necessary node-traversal methods are available, You can use one of the iterative tree-walking algorithms. Otherwise, you can use a traditional depth-first search with a stack. This converts the tree-like structure into a linear list.

Next, you can use Floyd’s two-pointer algorithm for finding a cycle in a linked list. This will tell you whether your tree loops back on itself.

Bingo, problem solved by combining two things you already know.

Bonus chatter: If you have to manage an explicit stack, then the memory usage can reach twice the number of nodes in the worst case. Alternatively, you can build a hash table of nodes that have been visited so far, and stop if you ever revisit a node before reaching the end of the tree. This caps the memory usage to the number of nodes.

Bonus bonus chatter: If you know the total number N of nodes, say, say because it’s recorded in the file format, or you can count them while you load the file, then you don’t need to use Floyd’s two-pointer cycle-finding algorithm. You can just walk through the tree until you either (1) reach the end, or (2) have walked through N nodes. Because once you reach node N + 1, you know that you must have revisited a node due to the pigeonhole principle, and therefore found a cycle.

Topics
Code

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

2 comments

Discussion is closed. Login to edit/delete existing comments.

  • Marek Knápek

    Or you can leverage the fact that most (all?) data structures are aligned in memory. Tree node will be always allocated at address ending with one or more (binary) zeros. Meaning child pointers will have zeros at the least significant end. Meaning you can use one of the tree walking algorithms marking the pointers you followed by putting one into them. If you ever find child pointer with one in it, you have cycle/graph. If...

    Read more
  • Kārlis Gaņģis

    The more proper approach for finding cycles (and one that can properly tell where is the cycle, not just that there is one) is Tarjan’s strongly connected components algorithm