When dealing with hierarchical data, we usually use recursion for constructing
the tree structure. It would be a performance problem if you query the tree node
data from the database every time the recursion happens. Also, if you work with
is very difficult to deal with the recursion because you have to determine when
the callback has finished execution. Luckily, PostgreSQL supports recursive
query using the
WITH RECURSIVE keyword. All the data structure can be returns
within a single query.
WITH queries in PostgreSQL
According to PostgreSQL document,
WITH queries provide a method for dealing
with large queries by creating temporary tables that exist only for that query.
This is a very simple example from Postgres document to illustrate the usage of
In the above example, the
WITH clause creates 2 temporary tables
regional_sales and top_region that exist only for the last
WITH RECURSIVE queries in PostgreSQL
WITH can be followed by
RECURSIVE modifier for the query to refer to its
own. This simple query will list all number from 1 to 100 using
Let me explain a bit about the example.
WITH RECURSIVE t(n) creates a
temporary tables named t with one column n. A
WITH RECURSIVE query
usually includes 2 parts: non-recursive term and recursive term. All the two
SELECT query should have exactly the same column as the t table.
PostgreSQL will first evaluate the non-recursive term (including discarding all
duplicate rows if you’re using just UNION not UNION ALL), put the result into a
temporary table and then evaluate the recursive term (which refers to itself)
and then append the result into the previous temporary table until it reaches
the stop condition or continues forever if there is no stop condition like the
above example. We need to use
LIMIT in the final
SELECT command to filter
WITH RECURSIVE for querying tree
Recursion is extremely useful for dealing with hierarchical data structure. Consider the pedigree tree example, I have 2 tables. One table stores all the family members information and the other table contains all the relations in the pedigree (i.e. the parent-children relation). The PedigreeRelations tables references to People table.
Here is the query to create these two tables
Insert some sample data
Now we come to the most chalenging part, querying the descendants with a given id. I have modified the search_graph query from PostfreSQL doc a bit to fit my data in 2 tables. Here is how the query look like
Remember this query is to find all the descendants of a given id, so you may
need to update the
r."parentId" = 1 in the 7th line to the id that you want to
The result of this query is all its descendants (not include itself). The result also contains a column (array type) holding the path from the given node to the node in the current row and a column carries the depth of the current node compare to the node we passed in before.
If you have a huge amount of data and you just want to query all descendant nodes within a specific depth level, uncomment
-- AND depth < 2 in the 17th line and change 2 to whatever depth
level that you want.