Introduction

When dealing with hierarchical data, we usually use recursion for constructing the tree structure. It would be a performance problem if you query the tree node data from the database every time the recursion happens. Also, if you work with languages which are heavily based on callback like Javascript, you will find it is very difficult to deal with the recursion because you have to determine when the callback has finished execution. Luckily, PostgreSQL supports recursive query using the WITH RECURSIVE keyword. All the data structure can be returns within a single query.

WITH queries in PostgreSQL

According to PostgreSQL document, WITH queries provide a method for dealing with large queries by creating temporary tables that exist only for that query. This is a very simple example from Postgres document to illustrate the usage of WITH queries.

In the above example, the WITH clause creates 2 temporary tables regional_sales and top_region that exist only for the last SELECT command.

WITH RECURSIVE queries in PostgreSQL

WITH can be followed by RECURSIVE modifier for the query to refer to its own. This simple query will list all number from 1 to 100 using WITH RECURSIVE

Let me explain a bit about the example. WITH RECURSIVE t(n) creates a temporary tables named t with one column n. A WITH RECURSIVE query usually includes 2 parts: non-recursive term and recursive term. All the two SELECT query should have exactly the same column as the t table. PostgreSQL will first evaluate the non-recursive term (including discarding all duplicate rows if you’re using just UNION not UNION ALL), put the result into a temporary table and then evaluate the recursive term (which refers to itself) and then append the result into the previous temporary table until it reaches the stop condition or continues forever if there is no stop condition like the above example. We need to use LIMIT in the final SELECT command to filter the result.

WITH RECURSIVE for querying tree

Recursion is extremely useful for dealing with hierarchical data structure. Consider the pedigree tree example, I have 2 tables. One table stores all the family members information and the other table contains all the relations in the pedigree (i.e. the parent-children relation). The PedigreeRelations tables references to People table.

Here is the query to create these two tables

Insert some sample data

Now we come to the most chalenging part, querying the descendants with a given id. I have modified the search_graph query from PostfreSQL doc a bit to fit my data in 2 tables. Here is how the query look like

Remember this query is to find all the descendants of a given id, so you may need to update the r."parentId" = 1 in the 7th line to the id that you want to find.
The result of this query is all its descendants (not include itself). The result also contains a column (array type) holding the path from the given node to the node in the current row and a column carries the depth of the current node compare to the node we passed in before.
If you have a huge amount of data and you just want to query all descendant nodes within a specific depth level, uncomment -- AND depth < 2 in the 17th line and change 2 to whatever depth level that you want.