Recursive Common Table Expressions for Hierarchical Data Queries in SQL

Recursive Common Table Expressions (CTEs) for Hierarchical Data Queries

Introduction

Hierarchical data queries are common in various domains, including organizational charts, family trees, and genealogy. In these scenarios, it’s essential to retrieve not only the immediate children but also the nested children of every parent. This problem can be solved using Recursive Common Table Expressions (CTEs) in SQL.

Problem Statement

Given a table with a parent-child relationship, we want to query all users at each level of nesting, including their parent and child IDs.

Example Data

The following example demonstrates the data structure:

UserIDParentID
1NULL
21
31
42
52
65
76
86
9NULL

Solution

We can solve this problem using a recursive CTE. The approach involves two main steps:

  1. Identify the root nodes: Select all rows with ParentID as NULL, which represent the top-level parents.
  2. Iteratively expand the hierarchy: Use the union all operator to combine the results of the previous step with the nested children.

Step-by-Step Solution

Identify the Root Nodes

First, we select all rows where the ParentID is NULL, which represent the root nodes:

WITH cte AS (
  SELECT UserID rootid, UserID, ParentID FROM users 
  UNION ALL
  SELECT cte.rootid rootid, users.UserID, users.ParentID FROM users
  INNER JOIN cte ON users.ParentID = cte.UserID
)
SELECT rootid parentid, UserID from cte
ORDER BY rootid, UserID;

Explaination

  1. We start by creating a CTE named cte.
  2. Inside the CTE, we define two recursive queries:
    • The first query selects all rows from the users table where ParentID is NULL, and assigns it as the root node (rootid). It also includes the UserID and ParentID columns for later use.
    • The second query joins the cte CTE with the users table on the condition that the ParentID in the users table matches the UserID in the cte.
  3. We then select all rows from the cte CTE, ordering them by both the rootid and UserID columns to produce the desired hierarchical structure.

Option: Limiting Recursion

To prevent an infinite recursion loop, we can set a limit on the maximum number of recursions using the OPTION (MAXRECURSION 0) clause. This ensures that the query does not go beyond a certain depth in the hierarchy:

WITH cte AS (
  SELECT UserID rootid, UserID, ParentID FROM users 
  UNION ALL
  SELECT cte.rootid rootid, users.UserID, users.ParentID FROM users
  INNER JOIN cte ON users.ParentID = cte.UserID
)
SELECT rootid parentid, UserID from cte
ORDER BY rootid, UserID;
OPTION (MAXRECURSION 0);

Example Output

The final output will contain all users at each level of nesting, including their parent and child IDs:

ParentIDUserID
11
12
13
14
15
16
17
18
22
24
25
26
27
28
33
44
55
56
57
58
66
67
68
77
88
99

Conclusion

Recursive CTEs are a powerful tool for solving hierarchical data queries. By identifying the root nodes and iteratively expanding the hierarchy, we can retrieve all users at each level of nesting, including their parent and child IDs. This approach allows us to efficiently process complex data structures and provides a flexible solution for various use cases.

Further Reading


Last modified on 2025-05-04