Recursive SQL with Common Table Expressions

This post intends to set out how to make recursive SQL queries in a simple and straightforward manner using Common Table Expressions (CTE).

WHAT ARE CTES?

As it is said in Microsoft SQL Server documentation a CTE (Common Table Expression) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing (important for the purposes in this post!) and can be referenced multiple times in the same query.

CTEs can be created in user-defined routines such as functions, stored procedures, views or triggers.

WHY USING  CTES?

From my view, CTEs are a great choice to improve readability and code maintenance as you can break up complex queries in small ones at the same time that you can reuse SQL code. In complex queries is a good tip to divide your SQL code into separate simple logical code blocks, especially when there are a lot of people in the team who needs to understand the code quickly before changing it.

Anyway, this post will show how to use CTEs to build Recursive SQL Queries in a simple cool way. I think this information is very useful for beginners (or not so beginners?) because I often check that CTEs are not used when necessary, for example by making wrong use of cursors instead. Eventually, this could penalise performance.

BACKGROUND

To show the examples I will use a simple (and typical) Employeestable, in which each employee has a manager, except the "supermanager" placed on the top of hierarchy. Besides, I will create another "Department" table to store the company's departments. The code to create the tables is shown below:

Copy Code
CREATE TABLE Departments (
    Id INT NOT NULL,
    Name NVARCHAR(50) NOT NULL,
    PRIMARY KEY(Id)
    );

    CREATE TABLE Employees (
    EmployeeId INT NOT NULL,
    FName NVARCHAR(50) NOT NULL,
    LName NVARCHAR(100) NOT NULL,
    PhoneNumber NVARCHAR(11),
    ManagerId INT,
    DepartmentId INT NOT NULL,
    Salary decimal(18,2) NOT NULL,
    HireDate DATETIME NOT NULL,
    PRIMARY KEY(EmployeeId),
    FOREIGN KEY (ManagerId) REFERENCES Employees(EmployeeId),
    FOREIGN KEY (DepartmentId) REFERENCES Departments(Id)
    );

It is important to note the existence of the foreign key on the same table in ManagerIdfield to reference the manager of each employee. This reference let us to build the necessary hierarchy for employee relations.

Here it is the scriptto load the previous shown tables:

Copy Code
INSERT INTO Departments
    ([Id], [Name])
    VALUES
    (1, 'Staff'),
    (2, 'Sales'),
    (3, 'Tech'),
    (4, 'HR');

    INSERT INTO Employees
    ([EmployeeId], [FName], [LName], [PhoneNumber], [ManagerId], [DepartmentId], [Salary], [HireDate])
    VALUES
        /* "Supermanager" (Note that ManagerId is null) */
    (1, 'Anthony', 'Wall', 1234567890, NULL, 1, 80000, '01-01-2012'),

    (2, 'John', 'Johnson', 2468101214, '1', 2, 50000, '23-03-2014'),
    (3, 'Michael', 'Williams', 1357911131, '1', 3, 45000, '12-05-2014'),
    (4, 'Johnathon', 'Smith', 1212121212, '1', 4, 35000, '24-07-2014'),

        /* Sales Department employees */
    (5, 'Melissa', 'Leon', 2568151214, '2', 2, 25000, '23-03-2015'),
    (6, 'John', 'Cobra', 2268151215, '2', 2, 25000, '23-03-2015'),
    (7, 'Michael', 'Goodman', 2757911131, '2', 2, 24000, '12-05-2015'),
    (8, 'Rick', 'Hart', 1212521512, '2', 2, 21000, '24-07-2015'),

        /* Tech Department employees */
    (9, 'Pedro', 'Leon', 2568151214, '3', 3, 25000, '23-03-2015'),
    (10, 'John', 'Cobra', 2268151215, '3', 3, 25000, '23-03-2015'),
    (11, 'Michael', 'Goodman', 2757911131, '3', 3, 24000, '12-05-2015'),
    (12, 'Rick', 'Hart', 1212521512, '3', 3, 21000, '24-07-2015'),

        /* HR Department employees */
    (13, 'David', 'Jordan', 4568151214, '4', 4, 25000, '23-03-2015'),
    (14, 'Nancy', 'Vacas', 3268151215, '4', 4, 25000, '23-03-2015')
    ;

So far, our company has 4 different departments and 15 employees. As you can see above, there are three leves of hierarchy being the "supermanager" the top one.

 

PLAYING WITH RECURSIVE CTES

The CTE query is itself made up of two SELECT statementsconnected with the UNION ALL operator. A recursive CTE query must contain at least two members (statements), connected by the UNION ALL, UNION, INTERSECT, or EXCEPT operator. In these examples, the first SELECT statement is the anchor member, and the second statement is the recursive member. All anchor members must precede the recursive members, and only the recursive members can reference the CTE itself. In addition, all members must return the same number of columns with corresponding data types.

At this point, we can start to play. Some typical queries to make would be:

 

"Children" employees from a given employee

The query will show all "children" employees from a given "EmployeeId". By default the query will not show data related to the employee used as an input param but this is not important in the example. If you want to show it, you just have to comment the last filter in where clause ("where EmployeeId != @EmployeeId").

As you can see the recursivity is achieved by means of a "recursive member" using "cteEmployees" as self-referencing member in order to find "children" records from the Employeestable. Recursive iteration will stop when results in "recursive member" doesn't exist.

The field "Level" will show the level of hierarchy just below the employee used as input param. So, this field for each record is incremented by one in each iteration. By default I will assign this "Level" field to "0" at the top of hierarchy results.

Copy Code
declare @EmployeeId int
    set @EmployeeId=1 /* Set @EmployeeId to look for descendants */

    /* Iterate to find 'descendant' employees for EmployeeId = @EmployeeId */
    ;with cteEmployees as
    (   /* Anchor Member */
    SELECT EmployeeId,E.FName, E.LName,ManagerId, DepartmentId, Salary,
    0 as [Level] /* Initial value for "Level field" */
    FROM Employees E
    where EmployeeId=@EmployeeId
    union all  
       /* Recursive Member. Query with auto-reference on cteEmployees.
          Recursive iteration stops when no results are found. */
    select E.EmployeeId,E.FName, E.LName,E.ManagerId, E.DepartmentId, E.Salary,  
        [Level]+1 /* Each iteration Level is incremented by one */
    from Employees E
    join cteEmployees on /* It looks for children in Employees table from cteEmployees */
    cteEmployees.EmployeeId=E.ManagerID 
    )
    select * from cteEmployees
    where EmployeeId != @EmployeeId  /* Filters @EmployeeId to show exclusively descendants */
    order by Level Asc

- Results using "EmployeeId=1" (Supermanager):

image

- Results using "EmployeeId=2" as "top" parent:

image

 

If line "where EmployeeId != @EmployeeId " in query is commented, results for the supermanager (it will be included) are:

image

 

"Parents" employees from a given employee

Copy Code
declare @EmployeeId int set @EmployeeId=14 /* Set @EmployeeId to look for "parents" */
    /* Iterate to find "parent" employees for @EmployeeId */
    ;with cteEmployees as
    (
    /* Anchor Member */
    SELECT EmployeeId,E.FName, E.LName,ManagerId, DepartmentId, Salary,
        0 as Level
    FROM Employees E
    where EmployeeId=@EmployeeId
    union all   /* Recursive Member */
    select E.EmployeeId,E.FName, E.LName,E.ManagerId, E.DepartmentId, E.Salary,
        [Level]+1
    from Employees E
    join cteEmployees on /* It looks for "parents" in Employees table from cteEmployees */
    cteEmployees.ManagerID=E.EmployeeId
    )
    /* Return results */
    select EmployeeId, FName, LName, ManagerId, DepartmentId, Salary
    from cteEmployees
    where EmployeeId != @EmployeeId /* Filters @EmployeeId to show exclusively "parents" */
    order by Level desc

- Results using "EmployeeId=14" to search "parent" employees:

image

- Results using "EmployeeId=1" (supermanager)

There are no results because "supermanager" is on the top of the hierarchy.


I hope this information has been useful !!

Add comment